Transcript Statistics - Ain Shams University
Slide 1
Statistics
When guessing is an informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: collecting data, describing data, and analyzing data lead to a good decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables (column headings) and the data (values in each row):

             Age    Gender    Blood type
Mrs Brown    32     Female    O
Mr Patel     24     Male      O
Ms Manda     20     Female    A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping into classes)
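The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the complication and group labels below are hypothetical values, not data from the lecture, and the helper names `frequency_table` and `cross_tabulate` are made up for this sketch.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one qualitative variable (not from the slides).
complications = ["Pain", "Nausea", "Pain", "Vomiting", "Pain",
                 "Cough", "Nausea", "Pain", "Vomiting", "Nausea"]

def frequency_table(values, order):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    freqs = [counts[c] for c in order]
    cumulative = list(accumulate(freqs))  # running total of the frequencies
    return [(c, f, f / total, cf) for c, f, cf in zip(order, freqs, cumulative)]

def cross_tabulate(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))

order = ["Nausea", "Vomiting", "Pain", "Cough"]
table = frequency_table(complications, order)
for category, freq, rel, cum in table:
    print(f"{category:<10} {freq:>3} {rel:>7.1%} {cum:>3}")

# Hypothetical group labels for the same ten patients.
groups = ["I", "I", "II", "I", "II", "II", "I", "II", "I", "II"]
crosstab = cross_tabulate(groups, complications)
print(crosstab[("II", "Pain")])  # patients in Group II who reported pain
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for unordered categories such as these it depends on the order chosen for the table.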
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); categories: Nausea, Vomiting, Pain, Cough]
Annotations: alternative, width, spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
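As a quick check of these four summary measures, the same five values can be run through Python's standard `statistics` module. The midrange has no built-in function, so it is computed here directly as (minimum + maximum) / 2; the second list is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

mean = statistics.mean(data)              # sum of values / number of values
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the extremes

# With an even number of values, the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and mode are unaffected, which is one reason the median is often preferred for skewed data.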
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
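A minimal sketch of how quartiles and percentiles could be computed for the eleven-value example, assuming Python 3.8 or later for `statistics.quantiles`. Several conventions for percentiles exist; the default "exclusive" method is used here, so other software may report slightly different values for the same data.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # eleven sorted values

# Three cut points that split the data into four equal parts.
q1, q2, q3 = statistics.quantiles(data, n=4)   # default method="exclusive"

# Nine cut points that split the data into ten equal parts (deciles).
deciles = statistics.quantiles(data, n=10)
p70 = deciles[6]   # 70th percentile
p90 = deciles[8]   # 90th percentile

print(q1, q2, q3)  # with this method: the 3rd, 6th and 9th values (5.0, 9.0, 17.0)
print(p70, p90)
```

The second quartile is the median, so `q2` matches `statistics.median(data)`.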
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
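The figures on this slide can be reproduced with the standard library, as sketched below. Two assumptions are worth stating: the inter-quartile range of 4 corresponds to the "inclusive" quartile convention, and the variance of 11.2 corresponds to the population formula (dividing by n). The sample formula (dividing by n - 1) is shown alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

value_range = max(data) - min(data)

# Quartiles with the "inclusive" convention (requires Python 3.8+).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas: divide the sum of squared deviations by n.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas: divide by n - 1 (the usual choice when estimating from a sample).
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(value_range, iqr)
print(pop_variance, round(pop_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

For these five values the sample variance is 14, so it matters which formula a given report or software package uses.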
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
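A rough sketch of how the standard error of a mean and a 95% confidence interval might be computed. The heart-rate values are hypothetical, the function name `mean_confidence_interval` is made up for this example, and the interval uses the normal approximation (critical value about 1.96); for a sample this small, a t-distribution critical value would give a somewhat wider interval.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical heart rates (beats/min) from a sample of ten patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

The sample mean is a statistic; the interval is the informed guess about the population parameter it estimates.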
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If the evidence against the null hypothesis is strong
enough for us to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 (rows) against whether H0 is true or false (columns):

               H0 is true       H0 is false
Accept H0      Good decision    Type II error
Reject H0      Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
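One way to make the p-value definition concrete is an exact permutation test, sketched below. This is an illustrative technique chosen for this example, not necessarily the test used in the lecture. The heart-rate values are hypothetical and `permutation_p_value` is a made-up helper name. Under the null hypothesis the group labels are interchangeable, so the p-value is the proportion of all possible relabelings that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    count = 0
    # Enumerate every way of choosing which pooled values are labeled "group A".
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
    return extreme / count

# Hypothetical heart rates (beats/min) after each drug.
conventional = [78, 82, 75, 80, 79]
new_drug = [70, 72, 68, 74, 71]

p = permutation_p_value(conventional, new_drug)
print(f"p = {p:.4f}")  # compared against the chosen alpha, e.g. 0.05
```

Because every value in one hypothetical group is lower than every value in the other, only 2 of the 252 possible relabelings are as extreme as the observed split, so the p-value is 2/252.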
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot: Weight (kg), 0 to 45, against Age (y), 0 to 14, with fitted line y = 2.03x + 8]
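The fitted line from the plot, y = 2.03x + 8, can be used for prediction as in the small sketch below. The function name `predict_weight_kg` is made up for illustration; the slope and intercept are the ones shown on the slide, and the prediction should only be trusted within the age range covered by the plot (roughly 0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predict weight (kg) from age (years) using the line y = slope * x + intercept."""
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```

The slope says that, on average, each additional year of age is associated with about 2 kg more weight; the intercept is the predicted weight at age 0.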
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
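The test result (rows) against the disease (outcome) status (columns):

                  Disease present    Disease absent
Test positive     TP                 FP
Test negative     FN                 TN

(TP: true positive, FP: false positive, FN: false negative, TN: true negative)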
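The two quantities plotted on a ROC curve follow directly from the four cells of this table. The sketch below uses hypothetical counts, and `diagnostic_metrics` is a made-up helper name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity and the false-positive rate from a 2x2 table."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly testing positive
    specificity = tn / (tn + fp)   # healthy patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,  # the x-axis of a ROC curve
    }

# Hypothetical counts: 50 patients with the disease, 100 without.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For a continuous test, each possible cut-off value gives a different 2x2 table, and plotting sensitivity against 1 - specificity over all cut-offs traces the ROC curve.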
[Figure: two ROC curves; y-axis: Sensitivity (0.0 to 1.0); x-axis: 1 - Specificity (0.0 to 1.0). Note on both plots: diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 3
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) at decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
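Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean heart rate on the new drug, the research question could be expressed as the pair of hypotheses below. The one-sided alternative is an assumption based on the wording of the question (the new drug lowers heart rate more); a two-sided alternative, in which the means simply differ, is also common.

```latex
H_0:\ \mu_1 = \mu_2
\qquad\text{versus}\qquad
H_1:\ \mu_2 < \mu_1
```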
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision
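The two error cells of the table above can be illustrated by simulation. The sketch below is only one possible setup and rests on simplifying assumptions: normally distributed data, a known standard deviation of 1, a two-sided z-test of H0: mean = 0, and a hypothetical true mean of 0.5 when H0 is false. The function name `rejection_rate` is illustrative.

```python
import math
import random
import statistics

def rejection_rate(true_mean, n=20, trials=2000, alpha=0.05, seed=0):
    """Estimate how often a two-sided z-test rejects H0: mean = 0.

    Samples are drawn from a normal distribution with the given true mean
    and a known standard deviation of 1 (a simplifying assumption).
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = statistics.mean(sample) * math.sqrt(n)  # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type I error, so the rate is near alpha.
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false: every failure to reject is a type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)
print(f"type I rate = {type_1_rate:.3f}, type II rate = {type_2_rate:.3f}")
```

Increasing `n` in this sketch leaves the type I rate near the chosen significance level but lowers the type II rate, which is the link between sample size and power.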
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
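The p-value definition above can be made concrete with an exact permutation test, which counts how many relabelings of the data give a result at least as extreme as the one observed. This is a sketch of one technique only, not necessarily the test intended in the lecture. The function name and the heart-rate values are hypothetical, and the test is one-sided (the conventional drug leaves a higher mean heart rate than the new drug).

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(conventional, new_drug):
    """One-sided exact permutation p-value for a difference in means.

    Statistic: mean(conventional) - mean(new_drug). Under H0 the group
    labels are interchangeable, so every way of splitting the pooled
    values into two groups of the original sizes is equally likely.
    """
    pooled = list(conventional) + list(new_drug)
    k = len(conventional)
    observed = mean(conventional) - mean(new_drug)
    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), k):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = mean(group_a) - mean(group_b)
        total += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point noise
            at_least_as_extreme += 1
    return at_least_as_extreme / total

# Hypothetical heart rates (beats per minute), invented for illustration.
inderal = [78, 74, 81, 76, 79]
new_drug = [70, 73, 68, 75, 71]
print(f"one-sided p = {permutation_p_value(inderal, new_drug):.4f}")
```

With five subjects per group there are 252 possible splits, so the smallest p-value this test can report is 1/252. Very small samples cannot produce very small p-values, whatever the data show.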
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Figure: scatter plot of Weight (kg) against Age (y), with fitted line y = 2.03x + 8
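Prediction with the fitted line y = 2.03x + 8 amounts to plugging an age into the equation. The sketch below does that, and also shows how such a line could be estimated by ordinary least squares. The function names are illustrative, and the data points are hypothetical: they are generated from the slide's own equation only to check that the fit recovers it.

```python
def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the slide's fitted line."""
    return slope * age_years + intercept

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

print(predict_weight(6))  # predicted weight at age 6

# Hypothetical points lying exactly on the slide's line, to check fit_line.
ages = [2, 4, 6, 8, 10]
weights = [predict_weight(a) for a in ages]
print(fit_line(ages, weights))
```

Predictions are only trustworthy inside the age range covered by the plotted data (roughly 0 to 14 years on the figure's axis).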
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
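The four cells of the table (TP, FP, FN, TN) are the inputs to the usual accuracy measures for a dichotomous diagnostic test. The sketch below computes sensitivity, specificity, and the positive and negative predictive values. The two predictive values are not named on the slide and are included here as standard companions. The function name and the counts are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Common accuracy measures from the four cells of the 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects who test positive
        "specificity": tn / (tn + fp),  # healthy subjects who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts, invented for illustration.
summary = diagnostic_summary(tp=90, fp=20, fn=10, tn=80)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.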
Figure: two ROC curve panels, each plotting Sensitivity against 1 - Specificity (both axes from 0.0 to 1.0). Diagonal segments are produced by ties.
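For a continuous diagnostic test, an ROC curve is built by sliding the cut-off across the observed scores and recording sensitivity against 1 - specificity at each step. The sketch below is a minimal standard-library version. The function names and data are hypothetical, and it assumes that higher scores indicate disease. When a diseased and a healthy subject share the same score, a single cut-off moves both coordinates at once, which is how tied values give rise to diagonal segments.

```python
def roc_points(scores, labels):
    """ROC points (1 - specificity, sensitivity), one per distinct cut-off.

    A subject is called test-positive when its score is >= the cut-off.
    labels: 1 = disease present, 0 = disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test results, invented for illustration.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
print(curve)
print(area_under_curve(curve))
```

An area of 1.0 corresponds to a test that separates the two groups perfectly, and 0.5 to one that does no better than chance.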
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 6
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
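As a rough check of these summary values, the sketch below recomputes the four measures for the slide's data set with Python's standard `statistics` module. The function name `central_tendency` is an illustrative assumption and is not part of the lecture. The midrange is taken as (minimum + maximum) / 2, which gives 5.5 for this data.

```python
import statistics

def central_tendency(values):
    """Return mean, median, mode and midrange of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Midrange: halfway between the smallest and largest value.
        "midrange": (min(values) + max(values)) / 2,
    }

data = [1, 1, 3, 5, 10]  # the example data set from the slides
summary = central_tendency(data)
print(summary)
```

Note how the single large value (10) pulls the mean and midrange upward, while the median and mode are unaffected.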
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
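The spread values above can be reproduced with the sketch below, again using only the standard library. Two choices here are assumptions rather than something the slides state: quartiles are computed with the "inclusive" method, and variance and standard deviation use the population formulas (dividing by n), since those are the choices that reproduce 4, 11.2 and 3.35. The name `dispersion` is hypothetical.

```python
import statistics

def dispersion(values):
    """Return range, inter-quartile range, variance and SD of a list of numbers."""
    # Quartiles with the "inclusive" method (an assumption; other
    # conventions can give different quartiles for small samples).
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        # Population formulas (divide by n), which match the slide's values.
        "variance": statistics.pvariance(values),
        "sd": statistics.pstdev(values),
    }

data = [1, 1, 3, 5, 10]  # the example data set from the slides
spread = dispersion(data)
print(spread)
```

With the sample formula (dividing by n - 1, `statistics.variance`) the variance of the same data would be 14 rather than 11.2, so it is worth stating which formula a report uses.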
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
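A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed is given below. The heart-rate values are hypothetical, the function name `mean_with_ci` is an assumption, and the interval uses the normal approximation (multiplier of about 1.96); for a small sample a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

def mean_with_ci(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) for a sample mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    # Normal-approximation multiplier (about 1.96 for 95%); for small
    # samples a t-distribution multiplier would be wider.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

heart_rates = [68, 72, 75, 70, 80, 77, 74, 69, 73, 76]  # hypothetical values
mean, se, (low, high) = mean_with_ci(heart_rates)
print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI=({low:.1f}, {high:.1f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.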
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 | H0 is true | H0 is false
Accept H0 | Good decision | Type II error
Reject H0 | Type I error | Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
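One way to make the p-value definition concrete is an exact permutation test, sketched below. This is an illustration under stated assumptions and not the method used in the lecture: the heart-rate values are hypothetical, the function name `permutation_p_value` is invented here, and the test statistic is the absolute difference in group means.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    extreme = 0
    total = 0
    # Under H0 the group labels are interchangeable, so every way of
    # choosing which values belong to group A is equally likely.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed outcome.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

conventional = [72, 75, 78, 80]  # hypothetical heart rates (beats/min)
new_drug = [64, 66, 70, 71]      # hypothetical heart rates (beats/min)
p = permutation_p_value(conventional, new_drug)
print(f"p = {p:.4f}")
```

Here only 2 of the 70 possible relabelings are as extreme as the observed split, so p = 2/70, roughly 0.029. If α was set to 0.05 in advance, H0 would be rejected.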
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
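The fitted line y = 2.03x + 8 from this plot can be used for prediction as in the short sketch below. The function name `predict_weight` is an assumption, and the line should only be trusted within the plotted age range (roughly 0 to 14 years).

```python
def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the slide's fitted line."""
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.2f} kg")
```

The slope means that each additional year of age adds about 2.03 kg to the predicted weight, and the intercept (8 kg) is the predicted weight at age 0.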
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is | Disease (outcome) present | Disease (outcome) absent
Positive | TP | FP
Negative | FN | TN
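From the four cells TP, FP, FN and TN, the quantities plotted on a ROC curve can be derived as sketched below. The counts are hypothetical and the function name `diagnostic_metrics` is an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # diseased patients correctly testing positive
    specificity = tn / (tn + fp)  # healthy patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # The ROC curve plots sensitivity against this quantity.
        "false_positive_rate": 1 - specificity,
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
print(metrics)
```

For a test with a continuous result, repeating this calculation at each possible cut-off value gives one (1 - specificity, sensitivity) point per cut-off, and joining those points gives the ROC curve.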
[Two ROC curves; x-axis: 1 - Specificity, 0.0 to 1.0; y-axis: Sensitivity, 0.0 to 1.0. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 9
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
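The first three summaries in this list can be sketched in a few lines of Python. The counts below are hypothetical: they were chosen only so that the relative frequencies match the four percentages on the lecture's pie chart, and their assignment to categories is an assumption. The function name `frequency_table` is likewise illustrative.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: one recorded complication per patient.
complications = (
    ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 5 + ["Cough"] * 20
)

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    categories = list(counts)  # categories in first-seen order
    freqs = [counts[c] for c in categories]
    cumulative = list(accumulate(freqs))  # running total of the frequencies
    return [
        (c, f, f / total, cum)
        for c, f, cum in zip(categories, freqs, cumulative)
    ]

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>5}")
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for unordered categories such as these it only shows the running total.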
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart, "Postoperative Complications". Slices: Nausea, Vomiting, Pain, Cough. Slice values: 23.5%, 27.5%, 9.8%, 39.2%.]
Advantages:
1. Summarize (area, relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart, "Postoperative Complications". Y-axis: Number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough.]
Alternative
Width
spacing
[Figure: simple bar chart, "Postoperative Complications". Y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart, "Postoperative Complications". Y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart, "Postoperative Complications". Y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
[Figure: stacked bar chart, "Postoperative Complications". Y-axis: Number of patients (%), 0% to 100%; x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon, "Postoperative Complications". Y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
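These four measures can be checked with Python's standard `statistics` module. This is a minimal sketch using the slide's own example values; the midrange is computed by hand as (minimum + maximum) / 2, which for these values is (1 + 10) / 2 = 5.5. The second data set illustrates the "odd vs even" case for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

mean = statistics.mean(data)             # sum of values / number of values
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between the extremes

# With an even number of values the median is the average of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and mode are unaffected by it.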
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
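The figures above can be reproduced with the standard `statistics` module, under two assumptions about how the slide computed them: the quartiles follow the "inclusive" interpolation method (Q1 = 1, Q3 = 5), and the variance divides by n (the population formula). Dividing by n - 1 instead (the sample formula) gives a variance of 14 for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

value_range = max(data) - min(data)

# Quartiles; the "inclusive" method treats the data as containing
# the minimum and maximum of the population.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
pop_sd = statistics.pstdev(data)             # square root of the above
sample_variance = statistics.variance(data)  # divides by n - 1

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

Statistical packages differ in their default quartile method, so the inter-quartile range of a small data set can vary between programs.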
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
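As an illustration of these two quantities, the sketch below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are hypothetical, not data from the lecture, and the interval uses the normal quantile (about 1.96); for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
sample = [68, 72, 75, 70, 74, 66, 71, 73, 69, 72]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample SD (n - 1 denominator)
standard_error = sd / math.sqrt(n)   # SD of the sample mean

# Normal approximation: the 97.5th percentile of the standard normal.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"mean = {mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.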
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
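The comparison can then be stated formally. This is a sketch that assumes μ1 denotes the population mean heart rate on Inderal and μ2 the population mean heart rate on the new drug, and that assumes a one-sided alternative (the new drug lowers heart rate more); the transcript does not show which alternative the lecturer used.

```latex
H_0:\ \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1:\ \mu_2 < \mu_1
```

A two-sided alternative, H1: μ1 ≠ μ2, would be used if a difference in either direction were of interest.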
Inferential statistics
(Informed guess)
Type I & Type II Errors
The H0 is:
                        True             False
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
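The definition of the p-value can be made concrete with an exact permutation test. This technique is swapped in here for illustration and is not necessarily the test used in the lecture. The heart-rate values and the function name `one_sided_permutation_p` are hypothetical. Under the null hypothesis the group labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as large as the one observed.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [72, 75, 70, 74, 73]
new_drug = [66, 68, 65, 69, 67]

def mean(values):
    return sum(values) / len(values)

def one_sided_permutation_p(group_a, group_b):
    """Exact p-value for the alternative mean(group_a) > mean(group_b)."""
    pooled = group_a + group_b
    observed = mean(group_a) - mean(group_b)
    n_a = len(group_a)
    as_extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mean(a) - mean(b) >= observed:
            as_extreme += 1
    return as_extreme / total

p_value = one_sided_permutation_p(inderal, new_drug)
print(f"one-sided p-value = {p_value:.4f}")
```

With these values only the observed labeling out of 252 is that extreme, so the p-value is 1/252; compared with a conventional α of 0.05, the null hypothesis would be rejected.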
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted line y = 2.03x + 8. Y-axis: Weight (kg), 0 to 45; x-axis: Age (y), 0 to 14.]
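The fitted line from the figure, y = 2.03x + 8, can be used for prediction. This is a minimal sketch; the function name `predict_weight_kg` and the chosen ages are illustrative, and only the slope and intercept come from the slide.

```python
SLOPE = 2.03      # kg per year of age, from the fitted line on the slide
INTERCEPT = 8.0   # kg, predicted weight at age 0

def predict_weight_kg(age_years):
    """Predicted weight (kg) from the slide's line y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT

for age in (2, 6, 10):
    print(f"age {age} y -> {predict_weight_kg(age):.1f} kg")
```

Such a line should only be used within the age range covered by the data (the plotted axis runs from 0 to 14 years); extrapolating beyond it is unreliable.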
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
[Figure: two ROC curves. Y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Plot note: "Diagonal segments are produced by ties."]
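The two axes of an ROC curve come straight from the TP/FP/FN/TN table: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below computes both and, for a continuous test, traces the ROC points by sliding the threshold. All counts, scores, and the function name `roc_points` are hypothetical, and it assumes that a higher score means "more likely diseased".

```python
def sensitivity(tp, fn):
    """Proportion of diseased patients with a positive test: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of disease-free patients with a negative test: TN / (TN + FP)."""
    return tn / (tn + fp)

def roc_points(scores_diseased, scores_healthy):
    """(1 - specificity, sensitivity) at each candidate threshold.

    A score at or above the threshold counts as a positive test.
    """
    thresholds = sorted(set(scores_diseased) | set(scores_healthy), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: no positives at all
    for t in thresholds:
        tp = sum(s >= t for s in scores_diseased)
        fn = len(scores_diseased) - tp
        fp = sum(s >= t for s in scores_healthy)
        tn = len(scores_healthy) - fp
        points.append((1 - specificity(tn, fp), sensitivity(tp, fn)))
    return points

# Hypothetical 2x2 counts for a dichotomous test.
tp, fp, fn, tn = 40, 10, 5, 45
print(f"sensitivity = {sensitivity(tp, fn):.2f}, "
      f"specificity = {specificity(tn, fp):.2f}")

# Hypothetical scores for a continuous test.
diseased = [0.9, 0.8, 0.7, 0.4]
healthy = [0.6, 0.3, 0.2, 0.1]
points = roc_points(diseased, healthy)
print(points)
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): sensitivity rises, but so does the false-positive rate.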
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
45
y = 2.03x + 8
40
Weight (kg)
35
30
25
20
15
10
5
0
0
2
4
6
8
Age (y)
10
12
14
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 15
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Infer (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables and the Data
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical (nominal) variable
The values (data) of a categorical variable are categories with no natural order
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
A population is the group of ALL
individuals (entities) sharing specific
characteristics.
Examples of populations:
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
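The frequency measures listed above can be sketched in a few lines of Python. The counts below are hypothetical (they were chosen only for illustration and are not the lecture's data), and all variable names are assumptions.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one qualitative variable
# (type of postoperative complication); not data from the lecture.
observations = ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 5 + ["Cough"] * 20

frequency = Counter(observations)                 # absolute frequency per category
total = sum(frequency.values())                   # number of observations
relative = {cat: n / total for cat, n in frequency.items()}        # share of total
cumulative = dict(zip(frequency, accumulate(frequency.values())))  # running total

for cat in frequency:
    print(f"{cat:10s} {frequency[cat]:3d} {relative[cat]:6.1%} {cumulative[cat]:3d}")
```

A cumulative frequency is mainly meaningful when the categories have an order (ordinal variables); for unordered categories such as these it only depends on the order in which they are listed.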
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart "Postoperative Complications" with slices for Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8% and 39.2%]
Advantage:
1. Summarize (relative frequency)
2. Show magnitude (area-relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories.
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart "Postoperative Complications"; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25)]
[Figure: alternative version of the same chart; y-axis: Number of patients (%), 0% to 45%]
Width
spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart "Postoperative Complications", Group I vs Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart "Postoperative Complications", Group I vs Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
[Figure: stacked bar chart "Postoperative Complications"; one bar per group (Group I, Group II), each divided into Cough, Pain, Vomiting and Nausea; y-axis: Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon "Postoperative Complications", Group I vs Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
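Odd number of values: 1 – 1 – 3 – 5 – 10 (median = 3, the middle value)
Even number of values: 1 – 3 – 5 – 9 (median = (3 + 5) / 2 = 4, the mean of the two middle values)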
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
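As a quick check of these summary measures, the following sketch recomputes them with Python's standard `statistics` module for the example values 1, 1, 3, 5, 10. The variable names are illustrative. Note that the midrange, taken as (minimum + maximum) / 2, works out to (1 + 10) / 2 = 5.5.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]                      # example values from the slides

data_mean = mean(data)                       # sum of values / number of values = 4
data_median = median(data)                   # middle value of the sorted data = 3
data_mode = mode(data)                       # most frequent value = 1
midrange = (min(data) + max(data)) / 2       # halfway between the extremes = 5.5

# With an even number of values the median is the mean of the two middle values.
even_median = median([1, 3, 5, 9])           # (3 + 5) / 2 = 4

print(data_mean, data_median, data_mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.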
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
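A minimal sketch of how quartiles and percentiles could be computed for the 11 values above. Several conventions exist for interpolating between observations; this example assumes the "inclusive" method of Python's `statistics.quantiles`, which is a choice made here and not necessarily the one used in the lecture.

```python
from statistics import quantiles

values = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # the 11 values on the slide

# n=4 returns the three quartile cut points (Q1, median, Q3).
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

# n=10 returns the nine decile cut points; the 7th and 9th are the
# 70th and 90th percentiles.
deciles = quantiles(values, n=10, method="inclusive")
p70, p90 = deciles[6], deciles[8]

print(q1, q2, q3, p70, p90)
```

Other conventions (for example the default "exclusive" method) give somewhat different cut points for small samples, so the method should be stated whenever percentiles are reported.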
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
Standard deviation = 3.35 (square root of the variance)
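The spread measures above can be reproduced with the sketch below. It assumes the "inclusive" quartile convention, which gives Q1 = 1 and Q3 = 5 for these values. The figures 11.2 and 3.35 correspond to the population formulas (dividing by n); the sample formulas (dividing by n − 1) are shown alongside for comparison.

```python
from statistics import pstdev, pvariance, quantiles, stdev, variance

data = [1, 1, 3, 5, 10]                                # example values from the slides

data_range = max(data) - min(data)                     # 10 - 1 = 9
q1, _, q3 = quantiles(data, n=4, method="inclusive")   # Q1 = 1, Q3 = 5
iqr = q3 - q1                                          # inter-quartile range = 4

pop_var = pvariance(data)      # sum of squared deviations / n       = 11.2
pop_sd = pstdev(data)          # square root of pop_var, about 3.35
sample_var = variance(data)    # sum of squared deviations / (n - 1) = 14
sample_sd = stdev(data)        # square root of sample_var, about 3.74

print(data_range, iqr, pop_var, round(pop_sd, 2), sample_var, round(sample_sd, 2))
```

When the five values are treated as a sample from a larger population, the n − 1 versions are the usual choice.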
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
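To make the two terms concrete, here is a small sketch that estimates the standard error of a sample mean and an approximate 95% confidence interval. It reuses the values 1, 1, 3, 5, 10 purely as an illustrative sample and assumes a normal approximation (multiplier of about 1.96); both are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [1, 1, 3, 5, 10]                    # illustrative sample, reused from the slides

n = len(sample)
sample_mean = mean(sample)
standard_error = stdev(sample) / sqrt(n)     # SD of the sample mean = SD / sqrt(n)

z = NormalDist().inv_cdf(0.975)              # about 1.96 for a two-sided 95% interval
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With only five observations, a multiplier from the t distribution would be more appropriate and would give a wider interval; the normal multiplier is used here only to keep the example within the standard library.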
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
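Writing μ1 and μ2 for the two population mean heart rates defined above, the hypotheses for this example would conventionally be stated as follows. This is a sketch of the usual two-sided formulation, not a statement taken from the slides.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{(no difference in mean heart rate)}
H_1:\ \mu_1 \neq \mu_2 \qquad \text{(the mean heart rates differ)}
```

Because the research question is directional (is the new drug better at lowering heart rate?), a one-sided alternative such as H1: μ2 < μ1 could be specified instead, provided this is decided before the data are examined.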
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0   H0 is true      H0 is false
Accept H0           Good decision   Type II error
Reject H0           Type I error    Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study (power = 1 − β)
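The definition of the p-value can be illustrated with an exact permutation test: if the null hypothesis is true, the group labels are interchangeable, so the p-value is the proportion of relabellings that give a difference at least as extreme as the one observed. This is one simple technique chosen for illustration, not a method named in the lecture, and the heart rates below are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min), for illustration only; not lecture data.
inderal = [72, 75, 70, 78, 74]
new_drug = [66, 69, 64, 71, 68]

observed = mean(inderal) - mean(new_drug)      # observed difference in means

pooled = inderal + new_drug
n = len(inderal)
total = 0
as_extreme = 0
# Enumerate every way of splitting the pooled values into two groups
# of the original sizes (252 ways for 5 + 5 values).
for idx in combinations(range(len(pooled)), n):
    chosen = set(idx)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = mean(group_a) - mean(group_b)
    total += 1
    if abs(diff) >= abs(observed) - 1e-9:      # two-sided: as or more extreme
        as_extreme += 1

p_value = as_extreme / total
print(f"p = {as_extreme}/{total} = {p_value:.3f}")
```

For these made-up values only 4 of the 252 relabellings are as extreme as the observed split, so p is about 0.016; with α set at 0.05 the null hypothesis of no difference would be rejected.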
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg), 0 to 45, against Age (y), 0 to 14, with fitted regression line y = 2.03x + 8]
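The regression line from the figure, y = 2.03x + 8, can be used for prediction as sketched below. The second function shows how such a line could be estimated by ordinary least squares; the age and weight pairs passed to it are hypothetical, and the function names are illustrative.

```python
def predict_weight(age_years):
    """Predicted weight (kg) from the line on the slide: y = 2.03x + 8."""
    return 2.03 * age_years + 8


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Prediction for a 5-year-old child using the slide's line.
weight_at_5 = predict_weight(5)                # about 18.15 kg

# Hypothetical (age, weight) observations, for illustration only.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20.5, 24, 28.5]
slope, intercept = least_squares(ages, weights)

print(round(weight_at_5, 2), round(slope, 2), round(intercept, 2))
```

Predictions are only reasonable within the range of ages used to fit the line (roughly 0 to 14 years in the figure).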
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
(TP = true positive, FP = false positive, FN = false negative, TN = true negative)
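The four cells of the table lead directly to the usual accuracy measures of a diagnostic test. The sketch below uses hypothetical counts; the function name and dictionary keys are illustrative.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return common accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # disease-free patients who test negative
        "ppv": tp / (tp + fp),           # positive results that are true positives
        "npv": tn / (tn + fn),           # negative results that are true negatives
    }


# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.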
[Figure: two ROC curves plotting Sensitivity (y-axis, 0.0 to 1.0) against 1 - Specificity (x-axis, 0.0 to 1.0). Diagonal segments are produced by ties.]
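For a continuous diagnostic test, an ROC curve is obtained by moving the cut-off value and recording sensitivity against 1 − specificity at each step. A minimal sketch, assuming hypothetical test scores in which higher values suggest disease; the function names are illustrative.

```python
# Hypothetical test scores, for illustration only; higher = more suggestive of disease.
diseased = [0.9, 0.8, 0.7, 0.55, 0.4]
healthy = [0.6, 0.5, 0.35, 0.3, 0.2]


def roc_points(pos, neg):
    """(1 - specificity, sensitivity) pairs for every possible cut-off."""
    points = [(0.0, 0.0)]                       # cut-off above every score
    for cutoff in sorted(set(pos + neg), reverse=True):
        sensitivity = sum(s >= cutoff for s in pos) / len(pos)
        false_positive_rate = sum(s >= cutoff for s in neg) / len(neg)
        points.append((false_positive_rate, sensitivity))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print(f"AUC = {auc:.2f}")
```

An area of 0.5 corresponds to a test that performs no better than chance and 1.0 to a test that separates the two groups perfectly; the made-up scores here give 0.88.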
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 17
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications.
Y-axis: Number of patients (%), 0% to 40%.
Categories: Nausea, Vomiting, Pain, Cough.
Series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
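The "odd vs even" point can be checked on the two data sets from this slide. This is a minimal sketch using Python's standard `statistics` module: with an odd number of values the median is the middle value, and with an even number it is the mean of the two middle values.

```python
from statistics import median

odd_sample = [1, 1, 3, 5, 10]   # five values: the median is the 3rd value
even_sample = [1, 3, 5, 9]      # four values: mean of the 2nd and 3rd values

print(median(odd_sample))   # 3
print(median(even_sample))  # 4.0, i.e. (3 + 5) / 2
```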
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
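The four measures of location can be reproduced for the slide's data set with a short sketch. The midrange is taken here as (smallest value + largest value) / 2, which is the usual definition and is stated as an assumption about what the slide intends.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]

average = mean(data)        # sum of values / number of values = 20 / 5
middle = median(data)       # middle value of the sorted data
most_common = mode(data)    # most frequent value
# Assumed definition: halfway between the smallest and largest value.
midrange = (min(data) + max(data)) / 2

print(average, middle, most_common, midrange)
```

Note how the single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected.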
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
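As a hedged illustration, the quartiles and percentiles of the 11-value data set can be computed with `statistics.quantiles`. Several interpolation rules are in common use and they give slightly different cut points; the "inclusive" rule used below is an assumption, not necessarily the rule used in the lecture.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Assumption: "inclusive" interpolation. Other rules give other cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# n=10 returns the 10th, 20th, ..., 90th percentiles (nine cut points).
deciles = quantiles(data, n=10, method="inclusive")
p70, p90 = deciles[6], deciles[8]

print("Q1, median, Q3:", q1, q2, q3)
print("70th and 90th percentiles:", p70, p90)
```

With the default `method="exclusive"` the same call would return different quartiles, so it is worth reporting which rule was used.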
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
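The four measures of spread can be reproduced as follows. Two assumptions are needed to match the figures on the slide: quartiles by the "inclusive" rule (Q1 = 1, Q3 = 5), and the population formulas for variance and standard deviation (dividing by n rather than n - 1).

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Assumption: "inclusive" quartile rule, which gives Q1 = 1 and Q3 = 5 here.
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumption: population formulas (divide by n).
# Squared deviations from the mean of 4: 9 + 9 + 1 + 1 + 36 = 56, and 56 / 5 = 11.2.
variance = pvariance(data)
std_dev = pstdev(data)

print(value_range, iqr, variance, round(std_dev, 2))
```

Dividing by n - 1 instead (`statistics.variance`) would give 56 / 4 = 14, the usual choice when the five values are a sample used to estimate a population variance.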
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
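A minimal sketch of both ideas for a sample mean: the standard error is the sample SD divided by the square root of n, and an approximate 95% CI is the mean plus or minus 1.96 standard errors. The heart-rate values are hypothetical, and the 1.96 multiplier assumes a normal approximation.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 10 resting heart rates (beats/min), illustration only.
heart_rates = [70, 72, 68, 74, 66, 70, 71, 69, 73, 67]

n = len(heart_rates)
sample_mean = mean(heart_rates)
sample_sd = stdev(heart_rates)          # sample SD (divides by n - 1)
standard_error = sample_sd / sqrt(n)    # SD of the sample mean

# Assumption: normal approximation, so the 95% multiplier is 1.96.
z = 1.96
ci_lower = sample_mean - z * standard_error
ci_upper = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_lower:.1f}, {ci_upper:.1f})")
```

For a sample this small, a t-distribution multiplier (about 2.26 for 9 degrees of freedom) would normally replace 1.96 and give a somewhat wider interval.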
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          The H0 is:
The decision about H0     True             False
Accept                    Good decision    Type II error
Reject                    Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
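The p-value definition can be made concrete with an exact permutation test, used here as an illustrative technique rather than the test used in the lecture. If the null hypothesis is true, the group labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as extreme as the one observed. The heart-rate data and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_difference(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_difference(sum(group_a))

    extreme = 0
    splits = 0
    # Every way of choosing which pooled values are labeled "group A".
    for chosen in combinations(range(len(pooled)), n_a):
        splits += 1
        if abs_mean_difference(sum(pooled[i] for i in chosen)) >= observed:
            extreme += 1
    return extreme / splits

# Hypothetical heart rates (beats/min) under two drugs, illustration only.
conventional_drug = [80, 82, 84, 86, 88]
new_drug = [60, 62, 64, 66, 68]

p_value = permutation_p_value(conventional_drug, new_drug)
print(f"p = {p_value:.4f}")
```

Only 2 of the 252 possible splits are as extreme as the observed one, so p = 2/252, about 0.008. If α had been set at 0.05 beforehand, the null hypothesis of no difference would be rejected.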
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted regression line:
Weight (kg) against Age (y), y = 2.03x + 8]
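Prediction with the fitted line amounts to plugging an age into the equation. The sketch below uses the slope and intercept shown on the chart; the function name `predict_weight` is hypothetical.

```python
SLOPE = 2.03       # kg gained per year of age, from the fitted line
INTERCEPT = 8.0    # predicted weight in kg at age 0

def predict_weight(age_years):
    """Predicted weight (kg) from y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT

for age in (2, 6, 10):
    print(f"age {age:>2} y -> predicted weight {predict_weight(age):.1f} kg")
```

The line should only be used within the age range covered by the data (roughly 0 to 14 years on the chart); extrapolating beyond it is not supported by the fit.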
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  The disease (outcome) is:
The test is:      Present    Absent
Positive          TP         FP
Negative          FN         TN
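The standard accuracy measures follow directly from the four cells of this table. The counts below are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical 2x2 counts, illustration only.
tp, fp = 90, 20   # test positive: disease present / disease absent
fn, tn = 10, 80   # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)   # diseased patients the test correctly detects
specificity = tn / (tn + fp)   # healthy patients the test correctly clears
ppv = tp / (tp + fp)           # positive predictive value
npv = tn / (tn + fn)           # negative predictive value

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Sensitivity and specificity are read down the columns (by true disease status), whereas the predictive values are read across the rows (by test result) and therefore change with disease prevalence.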
[Two ROC curves: Sensitivity against 1 - Specificity,
both axes 0.0 to 1.0.
Diagonal segments are produced by ties.]
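For a continuous diagnostic test, each possible cut-off gives one 2x2 table and hence one (1 - specificity, sensitivity) point; joining the points gives the ROC curve. The sketch below builds the points and the area under the curve by the trapezoid rule. The scores, the disease labels and the function names are hypothetical.

```python
def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; the test is positive if score >= cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test scores and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation; here 13 of the 16 diseased/healthy pairs are ranked correctly, giving 0.8125.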
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 20
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day-case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping.)
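The four summaries listed above can be sketched in a few lines of Python. The observations and group labels below are hypothetical and serve only to show how frequency, relative frequency, cumulative frequency and a cross tabulation are obtained.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: one recorded complication per patient.
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Vomiting"]
groups = ["I", "I", "II", "I", "II", "II", "I", "I", "II", "II"]

categories = ["Nausea", "Vomiting", "Pain", "Cough"]
counts = Counter(complications)

freq = [counts[c] for c in categories]        # frequency
total = sum(freq)
rel_freq = [f / total for f in freq]          # relative frequency
cum_freq = list(accumulate(freq))             # cumulative frequency

for c, f, r, cf in zip(categories, freq, rel_freq, cum_freq):
    print(f"{c:10s} {f:3d} {r:6.1%} {cf:3d}")

# Cross tabulation: complication (rows) by group (columns).
crosstab = Counter(zip(complications, groups))
for c in categories:
    print(c, [crosstab[(c, g)] for g in ("I", "II")])
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for unordered categories such as these it depends on the arbitrary order in which they are listed.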
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients, 0 to 25]
Alternative – Width – Spacing
[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea; y-axis: Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Mode
1 – 1 – 3 – 5 - 10
Midrange
1 – 1 – 3 – 5 - 10
Summary for 1 – 1 – 3 – 5 – 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
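The measures of location can be checked with Python's standard `statistics` module. The midrange has no library function, so it is computed here as the midpoint of the smallest and largest values, (minimum + maximum) / 2.

```python
import statistics

data = [1, 1, 3, 5, 10]   # example values from the slides

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
midrange = (min(data) + max(data)) / 2   # midpoint of the extremes

print(mean, median, mode, midrange)

# With an even number of values, the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])
print(even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.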
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
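Quartiles and percentiles for the 11-value example can be sketched with `statistics.quantiles`. Several conventions for locating a percentile exist; the default used here ("exclusive") places the p-th quantile at position p × (n + 1) in the sorted data and interpolates between neighbours, which may differ slightly from the convention used in the lecture or in other software.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # 11 values from the slide

# Three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Nine cut points that split the data into ten parts (deciles).
deciles = statistics.quantiles(data, n=10)
p70, p90 = deciles[6], deciles[8]

print("Quartiles:", q1, q2, q3)
print("70th and 90th percentiles:", p70, p90)
```

With 11 values, the quartile positions are 3, 6 and 9, so the quartiles fall exactly on observed values, whereas the 70th and 90th percentiles are interpolated between two neighbouring values.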
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Summary for 1 – 1 – 3 – 5 – 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
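The measures of spread for the same five values can be reproduced as follows. Two points are assumptions about how the slide values were obtained: the quartiles use the "inclusive" convention (which gives Q1 = 1 and Q3 = 5), and the variance divides by n rather than n − 1.

```python
import statistics

data = [1, 1, 3, 5, 10]   # example values from the slides

data_range = max(data) - min(data)

# "inclusive" treats the data as covering the whole range of interest.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)     # sum of squared deviations / n
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)   # sum of squared deviations / (n - 1)

print(data_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are a sample from a larger population, the n − 1 version (here 14) is the usual estimate of the population variance.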
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
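As a rough sketch, the standard error of a sample mean is the sample SD divided by the square root of the sample size, and an approximate 95% confidence interval is the mean plus or minus about 1.96 standard errors. The heart-rate values below are hypothetical, and the normal approximation is an assumption; for a sample this small, a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of resting heart rates (beats/min); not from the slides.
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 73]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)      # sample SD (n - 1 denominator)
se = sd / math.sqrt(n)             # standard error of the mean

# Normal approximation: the 97.5th percentile of N(0, 1) is about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The sample mean and SD are statistics; the interval is a statement about the unknown population mean, which is the parameter being estimated.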
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
H0: μ1 = μ2 (the two drugs do not differ in their effect on mean heart rate)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
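The definition of the p-value can be made concrete with an exact permutation test, which is one of several possible tests and is used here only because it follows the definition directly. If the null hypothesis is true, the drug labels are interchangeable, so the p-value is the fraction of all possible relabellings that give a difference in means at least as extreme as the one observed. The heart-rate values are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) after each drug; not from the slides.
inderal = [70, 68, 74, 72, 71]
new_drug = [64, 66, 69, 63, 67]

observed = abs(mean(inderal) - mean(new_drug))
pooled = inderal + new_drug
n1 = len(inderal)

# Enumerate every way of choosing which n1 pooled values are labelled "Inderal".
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n1):
    chosen = set(idx)
    g1 = [pooled[i] for i in idx]
    g2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    total += 1
    # Small tolerance guards against floating-point rounding in the comparison.
    if abs(mean(g1) - mean(g2)) >= observed - 1e-12:
        extreme += 1

p_value = extreme / total   # two-sided exact permutation p-value
print(total, extreme, p_value)
```

A small p-value means the observed difference would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.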
How does it work?
Probability distributions
Parametric vs non-parametric tests
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
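The fitted line shown on the prediction slide, y = 2.03x + 8, can be used as a simple prediction rule. The sketch assumes, from the axis labels, that x is age in years and y is weight in kilograms; the function name is hypothetical.

```python
# Coefficients read from the line printed on the slide: y = 2.03x + 8.
SLOPE_KG_PER_YEAR = 2.03
INTERCEPT_KG = 8.0


def predict_weight_kg(age_years):
    """Predicted weight (kg) for a given age (years) from the fitted line."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The line describes only the age range covered by the plotted data (roughly 0 to 14 years); predicting outside that range is extrapolation and may be unreliable.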
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease present    Disease absent
Test positive    TP                 FP
Test negative    FN                 TN
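From the four cells of the 2 × 2 table (TP, FP, FN, TN), the usual accuracy measures of a dichotomous test follow directly. The counts below are hypothetical, and the function name is only illustrative.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return common accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased who test positive
        "specificity": tn / (tn + fp),   # non-diseased who test negative
        "ppv": tp / (tp + fp),           # positive tests that are true
        "npv": tn / (tn + fn),           # negative tests that are true
    }


# Hypothetical counts; not from the slides.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.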
ROC Curve
[Figure: two ROC curves plotting Sensitivity (y-axis, 0.0 to 1.0) against 1 - Specificity (x-axis, 0.0 to 1.0). Diagonal segments are produced by ties.]
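For a continuous test, an ROC curve is built by treating every observed value as a possible cut-off and plotting sensitivity against 1 − specificity at each one. The sketch below assumes that higher values indicate disease; the test values and disease labels are hypothetical.

```python
def roc_points(scores, diseased):
    """(1 - specificity, sensitivity) for every cut-off 'score >= threshold'."""
    pos = sum(diseased)
    neg = len(diseased) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= threshold)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= threshold)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test values and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
diseased = [1, 1, 0, 1, 1, 0, 0, 0]

pts = roc_points(scores, diseased)
print(pts)
print("AUC =", auc(pts))
```

When a diseased and a non-diseased patient share the same test value, both coordinates change at that cut-off, which is what produces a diagonal segment on the curve.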
Survival analysis (censored data)
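The slide does not name a method; the Kaplan-Meier estimator is one standard way to handle censored follow-up, and a minimal sketch is given here as an assumption about what such an analysis could look like. Censored patients count as "at risk" until their last follow-up but never as events. The follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up (months) for six patients; 0 marks a censored patient.
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Simply discarding the censored patients, or treating them as events, would bias the estimated survival; this is why censored data need their own methods.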
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 23
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: Collecting data, Describing data, Analyzing data, Good Decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
              Mrs Brown    Mr Patel    Ms Manda
Age           32           24          20
Gender        Female       Male        Female
Blood type    O            O           A
Types of variables
1. Qualitative variables (data)
a) Categorical (nominal) variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
putting it into a table,
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
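The four tabulations listed above can be sketched in a few lines of Python. This is only an illustration: the patient data, the function names `frequency_table` and `cross_tabulation`, and the category order are assumptions made for the example, not part of the lecture. Cumulative frequency is only meaningful when the categories have an order, so an ordinal variable (degree of illness) is used.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(values, order):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    freqs = [counts[c] for c in order]
    total = sum(freqs)
    cumulative = list(accumulate(freqs))
    return [(c, f, f / total, cum) for c, f, cum in zip(order, freqs, cumulative)]


def cross_tabulation(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))


# Hypothetical data: degree of illness and gender for 8 patients.
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild", "Moderate", "Severe", "Mild"]
gender = ["F", "M", "F", "F", "M", "M", "F", "M"]

for category, freq, rel, cum in frequency_table(severity, ["Mild", "Moderate", "Severe"]):
    print(f"{category:<9} n={freq}  relative={rel:.1%}  cumulative={cum}")

# Cross tabulation: severity by gender.
table = cross_tabulation(severity, gender)
for level in ["Mild", "Moderate", "Severe"]:
    print(f"{level:<9} F={table[(level, 'F')]}  M={table[(level, 'M')]}")
```

For a numerical variable the same table can be built after grouping the values into intervals (for example age bands) and treating each interval as a category.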
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
[Pie chart: four slices (23.5%, 27.5%, 9.8%, 39.2%); legend: Nausea, Vomiting, Pain, Cough]
Advantages:
1. Summarizes the variable (area shows relative frequency)
2. Shows relative magnitude (frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25)]
Alternative; width; spacing
[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]
[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; stacked categories: Cough, Pain, Vomiting, Nausea; y-axis: Number of patients (%) (0% to 100%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications by group (Group I, Group II); x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%)]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
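The four measures of location can be checked with Python's standard `statistics` module. This is a minimal sketch on the slide's own data set; the helper name `midrange` is an assumption introduced for the example. Note that the midrange, (min + max) / 2, evaluates to 5.5 for this data set, while 4.5 is half of the range.

```python
from statistics import mean, median, multimode

data = [1, 1, 3, 5, 10]


def midrange(values):
    """Midpoint between the smallest and the largest value."""
    return (min(values) + max(values)) / 2


print("Mean     =", mean(data))
print("Median   =", median(data))
print("Mode     =", multimode(data))  # list of the most frequent value(s)
print("Midrange =", midrange(data))

# Even number of values: the median is the mean of the two middle values.
print("Median of 1-3-5-9 =", median([1, 3, 5, 9]))
```

The mean uses every value and is therefore pulled towards the outlying 10, whereas the median depends only on the middle of the ordered data.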
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
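The measures of spread can be reproduced with the standard `statistics` module. A sketch under two stated assumptions: quartiles are computed with the "inclusive" interpolation method (other conventions can give a different inter-quartile range for such a small data set), and the variance of 11.2 corresponds to the population formula, dividing by n. The sample formula, dividing by n - 1, gives 14 for the same data.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the inclusive method (an assumed convention).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("Range                 =", value_range)
print("1st and 3rd quartile  =", q1, q3)
print("Inter-quartile range  =", iqr)
print("Variance (divide by n)     =", pvariance(data))
print("Standard deviation         =", round(pstdev(data), 2))
print("Variance (divide by n - 1) =", variance(data))
```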
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
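As an illustration of the two terms above, the standard error of a sample mean is the sample standard deviation divided by the square root of the sample size, and an approximate 95% confidence interval is the mean plus or minus about 1.96 standard errors. The sketch below assumes a normal approximation (for small samples a t-distribution critical value is normally preferred), and the heart-rate values and the function name `mean_confidence_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # SD of the sample mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return centre, standard_error, (centre - z * standard_error, centre + z * standard_error)


# Hypothetical heart rates (beats per minute) from a sample of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 73]

centre, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {centre:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```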
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than another conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
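Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean heart rate on the new drug, the example can be stated as a pair of hypotheses. The two-sided alternative shown here is an assumption; a one-sided alternative (μ2 < μ1) would match the question of whether the new drug lowers heart rate more.

```latex
H_0 : \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1 : \mu_1 \neq \mu_2
```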
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
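The definition of the p-value can be made concrete with an exact permutation test, used here purely as an illustration and not necessarily the test the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so every way of splitting the pooled values into two groups is equally likely, and the p-value is the fraction of splits whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name `exact_permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means, by enumerating all splits."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    pooled_sum = sum(pooled)
    as_extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (pooled_sum - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if difference >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


# Hypothetical heart rates (beats per minute) in two small groups.
inderal = [72, 75, 70, 78, 74]
new_drug = [65, 68, 66, 70, 64]

p_value = exact_permutation_p_value(inderal, new_drug)
alpha = 0.05  # assumed conventional threshold for the type I error probability
print(f"p = {p_value:.4f}; reject H0 at alpha = {alpha}: {p_value < alpha}")
```

Full enumeration is only practical for small groups; with larger samples a random subset of splits or a parametric test is used instead.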
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
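A prediction line of the form y = bx + a, such as the weight-on-age line in the figure, can be obtained by ordinary least squares. The sketch below uses hypothetical age and weight values, not the data behind the slide's figure, so the fitted coefficients will differ from 2.03 and 8. The function name `fit_line` is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: age in years and weight in kg for six children.
ages = [2, 4, 6, 8, 10, 12]
weights = [12, 16, 20, 24, 29, 32]

slope, intercept = fit_line(ages, weights)
print(f"weight = {slope:.2f} * age + {intercept:.2f}")

# Prediction for a 7-year-old under the fitted line.
print(f"predicted weight at age 7: {slope * 7 + intercept:.1f} kg")
```

Predictions are only trustworthy within the range of ages used to fit the line.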
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease present    Disease absent
Test positive    TP                 FP
Test negative    FN                 TN
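From the four cells of this table (TP: true positives, FP: false positives, FN: false negatives, TN: true negatives) the usual accuracy measures follow directly. Sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The sketch below also reports the predictive values. The counts and the function name `diagnostic_summary` are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
    }


# Hypothetical counts for 100 patients.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a continuous test, repeating this calculation at every possible cut-off and plotting sensitivity against 1 - specificity gives the ROC curve.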
[Figure: two ROC curves; x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 25
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
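One way to test the null hypothesis that the two population mean heart rates are equal is an exact permutation test, sketched below. This is only an illustration: the heart-rate values are invented, the group names are hypothetical, and the lecture does not say which test it has in mind.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [72, 75, 78, 80]
new_drug = [64, 66, 70, 71]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
indices = range(len(pooled))

# Under H0 the group labels are interchangeable, so every way of splitting
# the 8 values into two groups of 4 is equally likely.
total = 0
as_extreme = 0
for group in combinations(indices, len(inderal)):
    chosen = set(group)
    a = [pooled[i] for i in group]
    b = [pooled[i] for i in indices if i not in chosen]
    total += 1
    # Small tolerance guards against floating-point rounding in the comparison.
    if abs(mean(a) - mean(b)) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / total  # two-sided

print("Observed difference in means =", observed)
print("Two-sided p-value            =", round(p_value, 4))
```

Here only 2 of the 70 possible splits give a difference at least as large as the observed one, so the p-value is 2/70.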
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0     H0 is true       H0 is false
Accept H0                 Good decision    Type II error
Reject H0                 Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
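The link between the type I error rate, the type II error rate, power (1 minus the type II error rate) and sample size can be sketched with the standard normal-approximation formula for comparing two means. The formula is a common textbook approximation rather than something taken from the slides, and the function name, the SD of 10 beats/min and the difference of 5 beats/min are all hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta`
    between two groups with common SD `sigma` (two-sided test).

    Uses n = 2 * ((z_(1 - alpha/2) + z_(power)) * sigma / delta) ** 2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# Hypothetical planning values: SD of 10 beats/min, difference of 5 beats/min.
n_80 = sample_size_per_group(sigma=10, delta=5, power=0.80)
n_90 = sample_size_per_group(sigma=10, delta=5, power=0.90)

print("n per group for 80% power =", n_80)
print("n per group for 90% power =", n_90)
```

Asking for more power, a smaller type I error rate, or the ability to detect a smaller difference all increase the required sample size.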
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8]
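The prediction line from the figure, weight = 2.03 × age + 8, can be used directly, and a line of the same kind can be fitted by least squares. The sketch below shows both; the ages and weights passed to the fitting function are invented for illustration and are not the data behind the figure.

```python
def predict_weight(age_years):
    """Evaluate the line shown in the figure: weight (kg) = 2.03 * age (y) + 8."""
    return 2.03 * age_years + 8

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations, used only to demonstrate the fitting step.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 25, 28]
slope, intercept = fit_line(ages, weights)

print("Predicted weight at 5 years =", round(predict_weight(5), 2), "kg")
print("Fitted line: y =", round(slope, 2), "x +", round(intercept, 2))
```

Predictions are most trustworthy inside the age range covered by the data; extending the line well beyond it is extrapolation.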
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
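The usual summary measures of a dichotomous diagnostic test follow directly from the four cells TP, FP, FN and TN. A minimal sketch, with a hypothetical function name and invented counts:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }

# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)

for name, value in summary.items():
    print(f"{name:12s} = {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.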
[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity. Diagonal segments are produced by ties.]
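For a continuous diagnostic test, an ROC curve is built by sliding the cut-off across the observed values and recording sensitivity against 1 - specificity at each step. The sketch below assumes that higher scores indicate disease and uses invented scores; the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A patient is called test-positive when score >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test results: four diseased and four disease-free patients.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print("ROC points:", points)
print("AUC =", auc)
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation; lowering the cut-off raises sensitivity at the cost of specificity.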
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 28
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications. y-axis: Number of patients (%), 0% to 40%. x-axis: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
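The four location measures for the example values can be checked with Python's standard `statistics` module. This is only a small sketch: `midrange` is a helper defined here (the standard library has none), taken as (minimum + maximum) / 2.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]  # the example values from the slide

def midrange(values):
    """Midpoint between the smallest and the largest value."""
    return (min(values) + max(values)) / 2

print("Mean     =", mean(data))      # sum of values / number of values
print("Median   =", median(data))    # middle value of the ordered data
print("Mode     =", mode(data))      # most frequent value
print("Midrange =", midrange(data))

# Even number of values: the median is the mean of the two middle values.
print("Median of 1-3-5-9 =", median([1, 3, 5, 9]))
```

Note how the single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.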
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
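Several conventions exist for computing percentiles, and they can give slightly different answers on small datasets. The sketch below assumes one common convention (linear interpolation between ordered values); `percentile` is a helper written for this illustration, not a method taken from the lecture.

```python
def percentile(values, p):
    """p-th percentile (p from 0 to 100) by linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile among the ordered values.
    position = p * (len(ordered) - 1) / 100
    lower = int(position)
    fraction = position - lower
    if lower + 1 >= len(ordered):
        return float(ordered[-1])
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second example from the slide

for p in (25, 75, 90, 70):
    print(f"{p}th percentile = {percentile(data, p)}")
```

The 1st and 3rd quartiles are simply the 25th and 75th percentiles, and the median is the 50th.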
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
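The spread measures for the example values can be reproduced with Python's `statistics` module. The sketch assumes the population formulas (dividing by n), which is what matches a variance of 11.2, and the "inclusive" quartile method; other quartile conventions may give a different inter-quartile range.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]  # the example values from the slide

value_range = max(data) - min(data)

# Quartiles by the "inclusive" method (one of several conventions).
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = pvariance(data)  # mean squared deviation from the mean (divide by n)
sd = pstdev(data)           # square root of the variance

print("Range               =", value_range)
print("Inter-quartile range =", iqr)
print("Variance            =", variance)
print("Standard deviation  =", round(sd, 2))
```

With the sample formulas (dividing by n - 1, `statistics.variance` and `statistics.stdev`) the same data give a variance of 14 and a standard deviation of about 3.74.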
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
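A minimal sketch of the two quantities named above, for a sample mean. It assumes a normal approximation for the 95% interval to stay within the standard library; with small samples a t-distribution critical value is normally used instead. The heart-rate values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower, upper), normal approximation."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)  # standard error = sample SD / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, se, m - z * se, m + z * se

# Hypothetical heart rates (beats/min), for illustration only.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

m, se, lower, upper = mean_confidence_interval(heart_rates)
print(f"mean = {m:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.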
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                                    H0 is true       H0 is false
Decision about H0: Accept           Good decision    Type II error
Decision about H0: Reject           Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
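The p-value definition can be made concrete with an exact permutation test: if H0 is true, the group labels are interchangeable, so the p-value is the share of all possible relabellings whose difference in means is at least as extreme as the one observed. This is one illustrative technique, not necessarily the test used in the lecture, and both heart-rate samples are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Under H0 every way of splitting the pooled data is equally likely.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical heart rates (beats/min) under each drug.
inderal = [70, 72, 68, 74, 71]
new_drug = [64, 66, 63, 69, 65]

p = permutation_p_value(inderal, new_drug)
print(f"p-value = {p:.4f}")
```

If the p-value is below the chosen α (commonly 0.05), the null hypothesis of no difference is rejected.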
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot: Weight (kg), 0 to 45, against Age (y), 0 to 14, with fitted line y = 2.03x + 8.]
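A fitted line of the form y = slope × x + intercept, as in the prediction example, is usually obtained by ordinary least squares. The sketch below shows that calculation with hypothetical age and weight pairs; it does not use the data behind the slide's line, so the coefficients will differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (age in years, weight in kg) pairs, for illustration only.
ages = [1, 2, 4, 6, 8, 10, 12]
weights = [10, 12, 16, 21, 24, 28, 33]

slope, intercept = fit_line(ages, weights)
print(f"weight = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted weight at age 5: {slope * 5 + intercept:.1f} kg")
```

Predictions are only trustworthy inside the range of ages that was actually observed.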
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                      Disease (outcome) present    Disease (outcome) absent
Test positive         TP                           FP
Test negative         FN                           TN
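The usual accuracy measures of a dichotomous diagnostic test follow directly from the four cells TP, FP, FN and TN. A short sketch with hypothetical counts; `diagnostic_metrics` is a helper name chosen here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name:<12}{value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.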
[Figure: two ROC curves, Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Note on each plot: Diagonal segments are produced by ties.]
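For a continuous diagnostic test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. The sketch below assumes that higher scores indicate disease and uses hypothetical scores; the helper names are chosen for this illustration.

```python
def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) for each cut-off 'score >= threshold'."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and not d)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test scores and true disease status, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]

points = roc_points(scores, has_disease)
print(points)
print("AUC =", area_under_curve(points))
```

When a diseased and a healthy patient share the same score, one cut-off moves both coordinates at once, which draws a diagonal segment on the curve.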
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 31
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
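The four summaries listed above can be sketched in a few lines of Python. The patient data, the variable names and the two groups are hypothetical, made up only to illustrate the calculations; cumulative frequency is shown on an ordinal variable because it only makes sense when the categories have an order.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ordinal data: degree of illness for ten patients.
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
# Hypothetical study group of the same ten patients.
groups = ["I", "I", "II", "I", "II", "II", "I", "II", "I", "II"]
order = ["Mild", "Moderate", "Severe"]   # category order for the table

counts = Counter(severity)
n = len(severity)

frequency = [counts[level] for level in order]   # count per category
relative = [f / n for f in frequency]            # share of the total
cumulative = list(accumulate(frequency))         # running total

# Cross tabulation: count every (group, severity) combination.
crosstab = Counter(zip(groups, severity))

for level, f, r, c in zip(order, frequency, relative, cumulative):
    print(f"{level:<10}{f:>3}{r:>7.0%}{c:>5}")
for group in ("I", "II"):
    print(group, [crosstab[(group, level)] for level in order])
```

A numerical variable can be tabulated the same way once its values are grouped into intervals.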
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Simple bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25). X-axis: Nausea, Vomiting, Pain, Cough]
Alternative
Width
Spacing
[Simple bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). X-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). X-axis: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). X-axis: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 100%). X-axis: Group I, Group II. Segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications. Y-axis: Number of patients (%) (0% to 40%). X-axis: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
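As a quick check, the four measures of location can be computed for the slide's data with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative, and the midrange is taken as (minimum + maximum) / 2, which gives 5.5 for these values.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]                 # values from the slide

data_mean = mean(data)                  # sum of values / number of values
data_median = median(data)              # middle value of the sorted data
data_mode = mode(data)                  # most frequent value
data_midrange = (min(data) + max(data)) / 2

# With an even number of values, the median is the mean of the two middle ones.
even_median = median([1, 3, 5, 9])

print(data_mean, data_median, data_mode, data_midrange, even_median)
```

The single large value (10) pulls the mean above the median, which is why the median is often preferred for skewed data.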
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; dividing by n – 1 gives 14)
Standard deviation = 3.35 (square root of 11.2)
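The measures of spread for the same data can be reproduced with the standard `statistics` module. This is a sketch under two assumptions: quartiles are computed with the "inclusive" method, which matches the inter-quartile range of 4 (other quartile conventions can give a different value), and the variance of 11.2 is the population form that divides by n.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]                      # values from the slide

data_range = max(data) - min(data)           # largest minus smallest value

# Quartiles; "inclusive" treats the data as the whole population.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                # inter-quartile range, Q3 - Q1

population_variance = pvariance(data)        # squared deviations / n
population_sd = pstdev(data)                 # square root of that variance
sample_variance = variance(data)             # squared deviations / (n - 1)

print(data_range, iqr, population_variance,
      round(population_sd, 2), sample_variance)
```

When the data are a sample used to estimate a population, the n – 1 form (`variance`, `stdev`) is the usual choice.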
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
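A minimal sketch of both ideas, using the eleven values from the percentiles slide as if they were a sample. The interval uses the normal approximation (mean ± about 1.96 standard errors), which is an assumption made here for simplicity; with so few observations a t-based interval would be wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

n = len(sample)
sample_mean = mean(sample)

# Standard error of the mean: sample SD divided by the square root of n.
standard_error = stdev(sample) / sqrt(n)

# z value leaving 2.5% in each tail of the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.2f}, SE = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The standard error shrinks as n grows, so larger samples give narrower intervals around the same mean.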
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
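The p-value definition can be made concrete with an exact permutation test. This is a swapped-in illustrative technique, not a method named on the slides, and the heart rates below are hypothetical. Under the null hypothesis the group labels are interchangeable, so the p-value is the share of all possible relabelings that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) under two drugs.
conventional = [78, 82, 75, 80, 77]
new_drug = [70, 72, 68, 74, 71]

observed = mean(conventional) - mean(new_drug)

pooled = conventional + new_drug
n_first = len(conventional)
n_second = len(pooled) - n_first
total = sum(pooled)

relabelings = 0
as_extreme = 0
# Enumerate every way of assigning n_first of the pooled values to group one.
for chosen in combinations(range(len(pooled)), n_first):
    first_sum = sum(pooled[i] for i in chosen)
    diff = first_sum / n_first - (total - first_sum) / n_second
    relabelings += 1
    # Two-sided: count differences at least as large in absolute value.
    if abs(diff) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / relabelings
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

If this p-value falls below the chosen α, the null hypothesis of no difference is rejected.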
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg) against Age (y); y = 2.03x + 8]
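A fitted line of the form y = a + bx can be obtained by ordinary least squares. The sketch below uses hypothetical age and weight values, not the data behind the slide's figure, so its coefficients are not expected to reproduce the slide's equation.

```python
from statistics import mean

# Hypothetical paired observations: age in years, weight in kg.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 21, 24, 28]

mean_age = mean(ages)
mean_weight = mean(weights)

# Least-squares slope: co-variation of x and y over the variation of x.
sxy = sum((x - mean_age) * (y - mean_weight) for x, y in zip(ages, weights))
sxx = sum((x - mean_age) ** 2 for x in ages)
slope = sxy / sxx
intercept = mean_weight - slope * mean_age


def predict_weight(age):
    """Predicted weight (kg) at a given age from the fitted line."""
    return intercept + slope * age


print(f"weight = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted weight at age 7: {predict_weight(7):.1f} kg")
```

Predictions are only reasonable within the range of ages used to fit the line.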
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
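The usual summary measures follow directly from the four cells of this table. The function name and the counts below are hypothetical and serve only to show the arithmetic.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Summary measures of a dichotomous diagnostic test from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }


# Hypothetical counts: TP = 40, FP = 10, FN = 5, TN = 45.
measures = diagnostic_measures(tp=40, fp=10, fn=5, tn=45)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

For a continuous test, repeating this calculation at every possible cut-off and plotting sensitivity against 1 - specificity gives the ROC curve.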
[Two ROC curve plots: Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 34
Statistics
When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of, and need for, statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
(Diagram: collecting data, describing data and analyzing data lead to a good decision)
Why do we need statistics?
Variability (atropine)
Causes of variability:
Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Effects of variability:
Large amount of data describing the same thing (many values for one variable)
No certainty ("deterministic vs probabilistic")
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable.
The variables and the data
            Mrs Brown   Mr Patel   Ms Manda
Age         32          24         20
Gender      Female      Male       Female
Blood type  O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Sample and Population
All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing cesarean section (CS) under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, perhaps putting it into a table, maybe presenting it in an appropriate chart, or summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
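The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the counts for "degree of illness" are hypothetical and do not come from the lecture. An ordinal variable is used because cumulative frequency is only meaningful when the categories have an order.

```python
from collections import Counter
from itertools import accumulate

# Ordered categories of an ordinal variable (degree of illness).
severity_order = ["Mild", "Moderate", "Severe"]

# Hypothetical observations, one per patient (illustrative only).
observations = ["Mild"] * 20 + ["Moderate"] * 15 + ["Severe"] * 5

counts = Counter(observations)
total = sum(counts.values())

frequency = [counts[level] for level in severity_order]   # absolute counts
relative = [f / total for f in frequency]                 # proportions of the total
cumulative = list(accumulate(frequency))                  # running totals in category order

for level, f, r, c in zip(severity_order, frequency, relative, cumulative):
    print(f"{level:<10} frequency={f:>3}  relative={r:.1%}  cumulative={c:>3}")
```

For a numerical variable the same table can be built after grouping the values into intervals.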
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate chart is almost always a good idea.
What ‘appropriate’ means depends primarily on the type of data, as well as on what particular features of it you want to explore.
Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
(Figure: pie chart with four slices, Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8% and 39.2%)
Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
(Figure: simple bar chart, Postoperative Complications; y-axis: number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough. Annotations: width, spacing)
(Figure: alternative simple bar chart, Postoperative Complications; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
(Figure: clustered bar chart, Postoperative Complications, Group I vs Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
(Figure: clustered bar chart, Postoperative Complications, Group I vs Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough)
(Figure: stacked bar chart, Postoperative Complications; one bar per group (Group I, Group II), stacked by Cough, Pain, Vomiting and Nausea; y-axis: number of patients (%), 0% to 100%)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between the bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
(Figure: frequency polygon, Postoperative Complications, Group I vs Group II; y-axis: number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
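As a quick check of these measures, the slide's example data (1, 1, 3, 5, 10) can be run through Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen for illustration, and the midrange is computed by hand as (minimum + maximum) / 2 because the module has no function for it.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slide

mean = statistics.mean(data)            # sum of values / number of values = 20 / 5
median = statistics.median(data)        # middle value of the sorted data (odd count)
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # (1 + 10) / 2

# With an even number of values the median is the mean of the two middle values.
even_median = statistics.median([1, 3, 5, 9])  # (3 + 5) / 2

print(mean, median, mode, midrange, even_median)
```

Note how the single large value (10) pulls the mean and the midrange upwards while leaving the median and mode unchanged.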
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
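The percentiles of the eleven-value example can be sketched with `statistics.quantiles`. Several conventions exist for interpolating percentiles, and different software may give slightly different answers; the "inclusive" method used here is an assumption, not necessarily the one intended in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second example from the slide

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Deciles: nine cut points; the 7th is the 70th percentile, the 9th is the 90th.
deciles = statistics.quantiles(data, n=10, method="inclusive")
p70 = deciles[6]
p90 = deciles[8]

print(f"Q1={q1}, median={q2}, Q3={q3}, P70={p70}, P90={p90}")
```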
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
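The spread measures above can be reproduced with the standard library. This sketch assumes the "inclusive" quartile convention, and it shows that the slide's variance of 11.2 corresponds to dividing by n (the population formula); dividing by n - 1 (the sample formula) gives a larger value.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slide

value_range = max(data) - min(data)  # 10 - 1

# Quartiles under the "inclusive" convention (an assumption; conventions differ).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # sum of squared deviations / n = 56 / 5
std_dev = statistics.pstdev(data)            # square root of the variance
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1) = 56 / 4

print(value_range, iqr, variance, round(std_dev, 2), sample_variance)
```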
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inferences about population parameters from sample statistics
Standard error (the standard deviation of the sample statistic)
Confidence interval (95% CI)
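A minimal sketch of how the standard error of a mean and an approximate 95% confidence interval might be computed. The heart-rate values are hypothetical, and the interval uses the normal critical value (about 1.96) as a simplifying assumption; for a sample this small a t-distribution critical value would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical resting heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

n = len(heart_rates)
mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)       # sample standard deviation (n - 1 denominator)
standard_error = sd / math.sqrt(n)       # standard deviation of the sample mean

# Normal approximation: the central 95% lies within about 1.96 standard errors.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"mean={mean:.1f}, SE={standard_error:.2f}, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```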
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question. For example, is stress a risk factor for breast cancer?
To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis or not.
If evidence against the null hypothesis is strong enough for us to be able to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people taking Inderal be μ1
Let the mean heart rate of all people taking the new drug be μ2
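Written in symbols, the hypotheses for this example might look as follows. This is a sketch under the assumption that μ1 and μ2 denote the population mean heart rates on Inderal and on the new drug respectively; whether the alternative is taken as two-sided or one-sided is a design choice that the slide does not state.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2 && \text{(no difference in mean heart rate)} \\
H_1 &: \mu_1 \neq \mu_2 && \text{(two-sided alternative)} \\
\text{or}\quad H_1 &: \mu_2 < \mu_1 && \text{(one-sided: the new drug lowers heart rate more)}
\end{align*}
```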
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: probability of committing a type I error
β: probability of committing a type II error
p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
Sample size & power of the study (power = 1 - β)
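The definition of the p-value can be made concrete with a permutation test, sketched below. This is one illustrative technique, not the test used in the lecture: under the null hypothesis the group labels are interchangeable, so the labels are shuffled many times and the p-value is estimated as the share of shuffles whose difference in means is at least as extreme as the one observed. The heart-rate values, the function name and the significance level of 0.05 are all assumptions.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random, as H0 allows
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical heart rates (beats/min) in the two treatment groups.
inderal = [68, 72, 70, 75, 71, 69, 73, 74]
new_drug = [62, 66, 64, 65, 67, 63, 61, 68]

p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05                 # accepted probability of a type I error
reject_h0 = p_value < alpha  # reject H0 only if the data would be rare under H0
print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

A small p-value means the observed difference would be unusual if the null hypothesis were true. It is not the probability that the null hypothesis is true.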
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
(Figure: scatter plot with fitted regression line y = 2.03x + 8; y-axis: Weight (kg), 0 to 45; x-axis: Age (y), 0 to 14)
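A prediction line such as y = 2.03x + 8 is typically obtained by least-squares regression. The sketch below shows the calculation on hypothetical data: the points are generated to lie exactly on the slide's line, so the fit recovers its slope and intercept. Real measurements would scatter around the line, and the function name is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical ages (years); weights are placed exactly on the slide's line.
ages = [1, 2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted_weight = slope * 5 + intercept  # predicted weight (kg) at age 5

print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 5: {predicted_weight:.2f} kg")
```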
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
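From the four cells of this table the usual measures of diagnostic accuracy follow directly. The counts below are hypothetical and serve only to illustrate the arithmetic.

```python
# Hypothetical counts for the 2x2 diagnostic table.
tp, fp = 45, 10   # test positive: disease present / disease absent
fn, tn = 5, 90    # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)   # proportion of diseased patients who test positive
specificity = tn / (tn + fp)   # proportion of healthy patients who test negative
ppv = tp / (tp + fp)           # positive predictive value
npv = tn / (tn + fn)           # negative predictive value

print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
      f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.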
(Figure: two ROC curves; y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.)
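When the diagnostic test gives a continuous result, each possible cut-off yields one pair of sensitivity and 1 - specificity, and plotting these pairs gives the ROC curve. The sketch below uses hypothetical scores and disease labels and estimates the area under the curve with the trapezoid rule; the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test results and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.4f}")
```

If a diseased and a healthy patient share the same score, both coordinates change at that cut-off, which is what produces the diagonal segments mentioned in the figure note.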
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
At best, the net effect is to waste time, effort, and money for the project.
At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.
Thank you
Slide 36
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
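As a rough illustration of the standard error and a 95% confidence interval for a mean, consider the sketch below. The heart-rate values are hypothetical, invented only for this example. The interval uses the normal approximation (multiplier of about 1.96); for a sample this small a t-based multiplier is commonly preferred and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical heart rates (beats/min); invented for illustration only.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
sample_mean = mean(heart_rates)

# Standard error of the mean: the SD of the sample mean as a statistic.
standard_error = stdev(heart_rates) / sqrt(n)

# Normal-approximation multiplier for a two-sided 95% interval.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(sample_mean, round(standard_error, 2), round(ci_low, 1), round(ci_high, 1))
```

The interval is an informed guess about the population mean: a larger sample shrinks the standard error and therefore narrows the interval.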
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                         H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
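The p-value definition above can be made concrete with a small exact permutation test, sketched here under stated assumptions. The two groups of heart rates are hypothetical, invented for this example, and the permutation test is a technique chosen for illustration because it follows the definition directly; the slides do not say which test the lecturer used. Under H0 the group labels are interchangeable, so every way of splitting the ten values into two groups of five is equally likely.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); invented for illustration only.
inderal = [70, 74, 68, 72, 76]
new_drug = [64, 66, 62, 69, 65]

pooled = inderal + new_drug
observed = abs(mean(inderal) - mean(new_drug))

extreme = 0  # splits at least as extreme as the one observed
total = 0    # all possible splits into two groups of five

for idx in combinations(range(len(pooled)), len(inderal)):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    # Small tolerance guards against floating-point rounding.
    if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
        extreme += 1

p_value = extreme / total  # two-sided p-value
alpha = 0.05               # accepted probability of a type I error
reject_h0 = p_value < alpha

print(total, extreme, round(p_value, 4), reject_h0)
```

With these invented values only a handful of the 252 possible splits give a difference as large as the observed one, so the p-value falls below α = 0.05 and H0 would be rejected.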
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y), ages 0 to 14, with fitted line y = 2.03x + 8]
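The fitted line on the prediction chart, y = 2.03x + 8 with weight (kg) on the y-axis and age (years) on the x-axis, can be used as sketched below. The function and constant names are assumptions introduced for illustration.

```python
SLOPE_KG_PER_YEAR = 2.03  # slope of the fitted line on the slide
INTERCEPT_KG = 8.0        # intercept of the fitted line on the slide


def predict_weight(age_years: float) -> float:
    """Predicted weight (kg) for a given age, using y = 2.03x + 8."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (2, 6, 10):
    print(age, round(predict_weight(age), 2))
```

The chart covers ages 0 to 14 years, so predictions outside that range would be extrapolation and should be treated with caution.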
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                         Disease present   Disease absent
Test positive            TP                FP
Test negative            FN                TN
(TP = true positive, FP = false positive, FN = false negative, TN = true negative)
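The usual summary measures of a dichotomous diagnostic test follow directly from the four cells TP, FP, FN and TN. The sketch below uses hypothetical counts, invented for illustration; the function name is also an assumption.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures computed from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among the diseased
        "specificity": tn / (tn + fp),  # negatives among the healthy
        "ppv": tp / (tp + fp),          # diseased among test positives
        "npv": tn / (tn + fn),          # healthy among test negatives
    }


# Hypothetical counts; invented for illustration only.
result = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in result.items():
    print(name, round(value, 3))
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.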
ROC Curve
[Figure: two ROC curve panels, each plotting Sensitivity (y-axis, 0.0 to 1.0) against 1 - Specificity (x-axis, 0.0 to 1.0). Diagonal segments are produced by ties.]
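For a continuous diagnostic test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. The sketch below shows one way to do this and to estimate the area under the curve with the trapezoidal rule. The marker values are hypothetical, and the sketch assumes that higher values indicate disease.

```python
# Hypothetical marker values; invented for illustration only.
diseased = [8.1, 7.4, 6.9, 6.2, 5.5]
healthy = [6.0, 5.1, 4.8, 4.2, 3.9]


def roc_points(pos, neg):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    thresholds = sorted(set(pos + neg), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every value: nobody test-positive
    for t in thresholds:
        sensitivity = sum(v >= t for v in pos) / len(pos)
        false_positive_rate = sum(v >= t for v in neg) / len(neg)
        points.append((false_positive_rate, sensitivity))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print(points)
print(round(auc, 2))
```

An area of 0.5 corresponds to a test no better than chance and an area of 1.0 to perfect separation of diseased and healthy subjects.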
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 39
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; y-axis: Number of patients (%), 0% to 100%; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
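A quick way to check these summary measures is Python's standard `statistics` module. The sketch below uses the slide's values 1, 1, 3, 5, 10 and the even-length example 1, 3, 5, 9; the midrange is taken here as (minimum + maximum) / 2, which is the usual definition and an assumption about what the slide intends.

```python
import statistics

data = [1, 1, 3, 5, 10]        # example values from the slide
even_data = [1, 3, 5, 9]       # even number of values, also from the slide

mean = statistics.mean(data)                 # sum of values / number of values
median_odd = statistics.median(data)         # middle value of the sorted data
median_even = statistics.median(even_data)   # average of the two middle values
mode = statistics.mode(data)                 # most frequent value
midrange = (min(data) + max(data)) / 2       # assumed definition: (min + max) / 2

print(mean, median_odd, median_even, mode, midrange)
```

Replacing the 10 with 100 moves the mean and the midrange a long way but leaves the median and the mode unchanged, which is why the median is often preferred for skewed data.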
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
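Percentiles can be computed in several slightly different ways, and textbooks and software do not all agree. As a sketch, the code below uses `statistics.quantiles` with `method="inclusive"` (linear interpolation between the smallest and largest observation) on the slide's 11 values; the choice of method is an assumption, not something the slide specifies.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # values from the slide

# 99 cut points dividing the sorted data into 100 equal parts.
cuts = statistics.quantiles(data, n=100, method="inclusive")


def percentile(p):
    """Return the p-th percentile (1 <= p <= 99) under the inclusive method."""
    return cuts[p - 1]


q1 = percentile(25)    # 1st quartile
q3 = percentile(75)    # 3rd quartile
p70 = percentile(70)   # 70th percentile
p90 = percentile(90)   # 90th percentile

print(q1, q3, p70, p90)
```

With another convention, for example `method="exclusive"`, the quartiles of the same data come out slightly different, so it is worth stating which rule was used when reporting them.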
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
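The figures above can be reproduced with the standard `statistics` module. Two conventions are assumed in this sketch because they match the slide's numbers: quartiles are computed with the inclusive method, and the variance is the population variance (divide by n). The sample variance (divide by n - 1) is shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]   # example values from the slide

value_range = max(data) - min(data)

# Quartiles with the inclusive method (an assumption that matches IQR = 4).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and SD: squared deviations from the mean divided by n.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample variance divides by n - 1 instead and is larger.
sample_variance = statistics.variance(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

In practice the sample version (n - 1) is the one usually reported when the data are a sample from a larger population.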
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
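As a rough illustration, the standard error of a sample mean is the sample SD divided by the square root of n, and an approximate 95% CI is the mean plus or minus about 1.96 standard errors. The heart-rate values below are hypothetical, and the normal multiplier is a simplifying assumption; for small samples a t-based multiplier gives a somewhat wider interval.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of resting heart rates (beats/min), illustrative only.
heart_rates = [72, 68, 75, 70, 80, 66, 74, 71, 69, 75]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)    # sample SD (n - 1 denominator)

# Standard error: the SD of the sample mean across repeated samples.
standard_error = sample_sd / math.sqrt(n)

# Normal-approximation multiplier for a two-sided 95% interval (about 1.96).
z = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

A larger sample shrinks the standard error and therefore narrows the interval.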
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                      H0 is true       H0 is false
Accept H0             Good decision    Type II error
Reject H0             Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
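The p-value definition above can be made concrete with an exact permutation test, sketched here on hypothetical heart rates for two small groups. This is one simple non-parametric way to get a p-value, not necessarily the test the lecture has in mind; the data, the function names, and the 0.05 threshold are assumptions for the example.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min) in two treatment groups, illustrative only.
inderal = [70, 72, 68, 74, 71]
new_drug = [64, 66, 63, 69, 65]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Under H0 the group labels are exchangeable, so every way of splitting
    the pooled values into two groups of the original sizes is equally likely.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    for chosen in combinations(range(len(pooled)), len(group_a)):
        chosen = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the one actually observed.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05                       # conventional type I error rate (assumption)
reject_h0 = p_value < alpha
print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value is simply the share of all possible relabelings that produce a difference at least as large as the observed one, assuming the null hypothesis is true.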
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
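A fitted line of the form y = a + bx, like the one in the figure, can be obtained by ordinary least squares. The sketch below fits hypothetical age and weight values (not the lecture's data) and then uses the line for prediction; `fit_line` is a name made up for the example.

```python
# Hypothetical ages (years) and weights (kg), illustrative only.
ages = [2, 4, 6, 8, 10]
weights = [12.0, 16.0, 20.5, 24.0, 28.5]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, weights)
predicted_weight_at_5 = intercept + slope * 5   # prediction inside the observed age range

print(f"weight = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted weight at age 5: {predicted_weight_at_5:.2f} kg")
```

Predictions are only trustworthy within the range of ages that were actually observed.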
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                      Disease present   Disease absent
Test positive         TP                FP
Test negative         FN                TN
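From the four cells of this table, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). A minimal sketch with hypothetical counts (the numbers and the function name are assumptions for illustration):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity and specificity from the cells of a 2x2 diagnostic table."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly testing positive
    specificity = tn / (tn + fp)   # healthy patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,   # the x-axis of a ROC curve
    }


# Hypothetical counts, illustrative only.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```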
[Two ROC curves: y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.]
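For a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) pair, and joining those pairs gives the ROC curve. The sketch below builds the points and the area under the curve for hypothetical test scores; the data and function names are assumptions for illustration.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) for every cut-off, highest cut-off first.

    A patient is called test-positive when the score is >= the cut-off.
    labels: 1 = disease present, 0 = disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]   # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status, illustrative only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc}")
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to a test that separates diseased from healthy patients perfectly.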
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 40
Slide 41
Slide 42
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
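The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the patient records below are hypothetical and are not data from the lecture, and the function names frequency_table and cross_tabulate are made up for this sketch.

```python
from collections import Counter

# Hypothetical observations: (group, postoperative complication) per patient.
patients = [
    ("Group I", "Nausea"), ("Group I", "Pain"), ("Group I", "Pain"),
    ("Group I", "Vomiting"), ("Group II", "Nausea"), ("Group II", "Nausea"),
    ("Group II", "Cough"), ("Group II", "Pain"),
]

def frequency_table(values):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, running = [], 0
    for category in sorted(counts):
        running += counts[category]
        rows.append((category, counts[category],
                     counts[category] / total, running / total))
    return rows

def cross_tabulate(pairs):
    """Count each (row, column) combination of two qualitative variables."""
    return Counter(pairs)

table = frequency_table(complication for _, complication in patients)
crosstab = cross_tabulate(patients)

for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.1%}{cum:>8.1%}")
print("Group II with nausea:", crosstab[("Group II", "Nausea")])
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); here the categories are simply sorted alphabetically to keep the sketch short.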
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarizes (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Simple bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25). Categories: Nausea, Vomiting, Pain, Cough. Notes on the slide: width, spacing]
[Alternative: the same chart with the y-axis as Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 45%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 45%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 100%. One bar each for Group I and Group II, stacked by Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications. Y-axis: Number of patients (%), 0% to 40%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
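As a quick check, the four measures of location can be computed with Python's standard statistics module. This is a minimal sketch using the slide's example values; the midrange is taken here as (minimum + maximum) / 2, which gives (1 + 10) / 2 = 5.5 for these data.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

mean = statistics.mean(data)             # sum of values / number of values
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between the two extremes

# With an even number of values there is no single middle value,
# so the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean above the median, which is why the median is often preferred for skewed data.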
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
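The measures of spread above can be reproduced with the standard library, as a rough sketch. Two assumptions are worth stating: the slide's variance of 11.2 corresponds to the population formula (dividing by n), whereas the sample formula (dividing by n - 1) gives 14; and quartile conventions differ between textbooks and software, so the "inclusive" method is chosen here because it reproduces the inter-quartile range of 4.

```python
import math
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

value_range = max(data) - min(data)

# Quartiles: conventions vary; "inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # mean squared deviation (divides by n)
std_dev = math.sqrt(variance)                # same units as the data
sample_variance = statistics.variance(data)  # divides by n - 1 instead

print(f"range = {value_range}, IQR = {iqr}")
print(f"variance = {variance:.1f}, SD = {std_dev:.2f}")
print(f"sample variance = {sample_variance}")
```

When the data are a sample used to estimate a population, the n - 1 version is the usual choice.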
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
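A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)    # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)    # SD of the sample mean

# Normal approximation: the 97.5th percentile of the standard normal (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as the sample size grows, so larger samples give narrower intervals around the same mean.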
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision      H0 is true       H0 is false
about H0
Accept H0         Good decision    Type II error
Reject H0         Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
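One way to make the p-value definition concrete is a permutation test, sketched below. This is an illustrative technique chosen for the sketch, not necessarily the test used in the lecture. The heart rates are hypothetical, and the function name permutation_p_value and its parameters are made up here. If H0 is true, the group labels are interchangeable, so the labels are shuffled many times and the p-value is the share of shuffles giving a difference at least as extreme as the one observed.

```python
import random
import statistics

# Hypothetical heart rates (beats/min) under each drug; illustration only.
inderal = [70, 68, 74, 72, 69, 71, 73, 75]
new_drug = [64, 66, 69, 63, 67, 65, 70, 62]

def permutation_p_value(a, b, n_permutations=5_000, seed=0):
    """Two-sided p-value for H0: both groups have the same mean."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(statistics.mean(pooled[:len(a)]) -
                   statistics.mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

p = permutation_p_value(inderal, new_drug)
alpha = 0.05  # accepted probability of a type I error
print("reject H0" if p < alpha else "do not reject H0", f"(p = {p})")
```

Choosing α before looking at the data fixes the type I error rate; the type II error rate (β) then depends on the sample size and the true size of the effect.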
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg) against Age (y), with the regression equation y = 2.03x + 8]
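Prediction with a straight line can be sketched as an ordinary least-squares fit. The (age, weight) pairs below are hypothetical and are not the points behind the slide's chart, and fit_line is a helper name made up for this sketch. The last line simply applies the equation shown on the slide, y = 2.03x + 8.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (age in years, weight in kg) pairs; illustration only.
ages = [2, 4, 6, 8, 10, 12]
weights = [12.5, 15.5, 20.5, 24.0, 28.5, 32.0]

slope, intercept = fit_line(ages, weights)
predicted_weight = slope * 7 + intercept  # predicted weight at age 7
print(f"weight = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted weight at age 7: {predicted_weight:.1f} kg")

# Applying the equation shown on the slide to a 7-year-old.
slide_prediction = 2.03 * 7 + 8
print(f"slide equation at age 7: {slide_prediction:.2f} kg")
```

Predictions are only trustworthy inside the range of ages that were actually observed; extrapolating the line beyond it is a guess.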
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
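The usual summary measures follow directly from the four cells of the table. A minimal sketch, with hypothetical counts and a made-up function name (diagnostic_metrics):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are true positives
        "npv": tn / (tn + fn),          # negative results that are true negatives
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.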
ROC Curve
[Two ROC curves: Sensitivity (y-axis) against 1 - Specificity (x-axis), both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
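For a continuous test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. The sketch below assumes that a higher score means "more likely diseased"; the scores, disease labels and function names (roc_points, area_under_curve) are hypothetical.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    A patient tests positive when the score is >= the cut-off.
    """
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(s >= cutoff and d for s, d in zip(scores, has_disease))
        fp = sum(s >= cutoff and not d for s, d in zip(scores, has_disease))
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test scores and true disease status; illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc}")
```

When a diseased and a disease-free patient share the same score, both coordinates change in a single step, which is what draws a diagonal segment on the curve.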
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 45
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: collecting data, describing data, analyzing data, good decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables...  ...and the data
              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous – binary):
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping.)
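The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the observations below are hypothetical and do not come from the slides, and the category order is supplied by hand because cumulative frequency only makes sense for ordered categories.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of an ordinal variable (degree of illness).
observations = ["Mild", "Severe", "Moderate", "Mild", "Mild",
                "Moderate", "Severe", "Mild", "Moderate", "Mild"]
order = ["Mild", "Moderate", "Severe"]  # ordering of the categories

counts = Counter(observations)
n = len(observations)

frequency = [counts[c] for c in order]        # how often each value occurs
relative = [f / n for f in frequency]         # share of the total
cumulative = list(accumulate(frequency))      # running total over the order

# Print a small frequency table: category, frequency, relative, cumulative.
for c, f, r, cum in zip(order, frequency, relative, cumulative):
    print(f"{c:<10}{f:>4}{r:>8.1%}{cum:>6}")
```

For a numerical variable the same table can be built after grouping the values into classes.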
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice percentages: 23.5%, 27.5%, 9.8%, 39.2%.]
Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); categories: Nausea, Vomiting, Pain, Cough. Notes on the chart: width, spacing.]
[Alternative bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); same categories.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); one bar per group (Group I, Group II); segments: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
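The four measures of location can be checked with Python's standard `statistics` module. This is a small sketch using the data set from the slides (1, 1, 3, 5, 10) and the even-sized set 1, 3, 5, 9; the midrange is computed by hand as (minimum + maximum) / 2, which for this data set is (1 + 10) / 2 = 5.5.

```python
import statistics

data = [1, 1, 3, 5, 10]      # data set used on the slides
even_data = [1, 3, 5, 9]     # even number of values

mean = statistics.mean(data)                 # sum of values / number of values
median = statistics.median(data)             # middle value of the sorted data
mode = statistics.mode(data)                 # most frequent value
midrange = (min(data) + max(data)) / 2       # halfway between the extremes

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median(even_data)

print(mean, median, mode, midrange, median_even)
```

The mean uses every value and so is pulled towards the outlying 10, while the median and mode are not.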
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; 14 if dividing by n − 1)
Standard deviation = 3.35 (square root of 11.2)
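The measures of spread can be reproduced the same way. The sketch below assumes the "inclusive" quantile convention of Python's `statistics.quantiles`; other conventions give different quartiles for small data sets, so the inter-quartile range in particular depends on that choice. The variance of 11.2 corresponds to dividing by n (population formula); dividing by n − 1 (sample formula) gives 14.

```python
import statistics

data = [1, 1, 3, 5, 10]
data2 = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # second data set on the slides

data_range = max(data) - min(data)

# Quartiles under the "inclusive" convention (an assumption, see lead-in).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
pop_sd = statistics.pstdev(data)             # square root of pvariance
sample_variance = statistics.variance(data)  # divides by n - 1

# Percentiles of the second data set: 99 cut points, index p - 1 is the p-th.
percentiles = statistics.quantiles(data2, n=100, method="inclusive")
p70, p90 = percentiles[69], percentiles[89]

print(data_range, iqr, pop_variance, round(pop_sd, 2), sample_variance, p70, p90)
```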
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
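As an illustration of the standard error and the 95% confidence interval of a mean, the sketch below uses a hypothetical sample of heart rates (not from the slides) and the normal approximation, mean ± 1.96 × SE. For a sample this small a t-based interval would be somewhat wider; the normal version is used here only to keep the example within the standard library.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not from the slides.
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)          # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                 # standard error of the mean

# 97.5th percentile of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```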
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If the evidence against the null hypothesis is strong
enough for us to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
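One way to make this example concrete is to state the null hypothesis as "no difference between the two population means" and test it on two samples. The sketch below is an assumption-laden illustration, not the method used in the lecture: the heart rates are hypothetical, and the test is an exact two-sided permutation test, chosen because it needs only the standard library.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) in two independent samples.
inderal = [68, 72, 70, 74, 69, 71]
new_drug = [62, 65, 60, 66, 63, 64]

def permutation_p_value(a, b):
    """Exact two-sided permutation test for a difference in means."""
    pooled = a + b
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    extreme = 0
    count = 0
    # Under H0 the group labels are arbitrary, so try every relabelling.
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / len(a) - (total - sum_a) / len(b))
        if diff >= observed - 1e-12:   # as extreme as the observed difference
            extreme += 1
        count += 1
    return extreme / count

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if the two drugs had the same effect on mean heart rate.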
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 (rows) against whether H0 is true or false (columns):
               H0 is true       H0 is false
Accept H0      Good decision    Type II error
Reject H0      Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study (power = 1 − β)
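The link between the two error probabilities, power and sample size can be sketched with the usual normal-approximation formula for comparing two means, n per group = 2 × ((z for 1 − α/2) + (z for power))² × σ² / δ². The function name and the numbers below (a difference of 5 beats/min, SD of 10) are hypothetical choices for illustration only.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation
    alpha: probability of a type I error
    power: 1 - probability of a type II error
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

n = n_per_group(delta=5, sigma=10)                    # 80% power
n_high_power = n_per_group(delta=5, sigma=10, power=0.90)
print(n, n_high_power)
```

Asking for more power, a smaller α, or a smaller detectable difference all increase the required sample size.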
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted regression line: Weight (kg) against Age (y); fitted line y = 2.03x + 8]
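Prediction with a fitted line such as y = 2.03x + 8 (weight in kg against age in years) can be sketched as an ordinary least-squares fit. The data points below are hypothetical and were generated to lie exactly on the line quoted on the slide, so the fit simply recovers that slope and intercept; real data would scatter around the line.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical ages (years) and weights (kg) lying exactly on y = 2.03x + 8.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * a + 8 for a in ages]

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept   # predicted weight of a 7-year-old
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 7: {predicted:.2f} kg")
```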
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test result (rows) against the disease (outcome) status (columns):
                 Disease present    Disease absent
Test positive    TP                 FP
Test negative    FN                 TN
ROC Curve
[Figure: two ROC curves, Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
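The four cells of the diagnostic table lead directly to the usual accuracy measures. The sketch below uses hypothetical counts and a hypothetical function name; each cut-off of a continuous test gives one such table, and therefore one point (1 - specificity, sensitivity) on the ROC curve.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from the cells of a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }

# Hypothetical counts; not from the slides.
m = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)

# Coordinates of this cut-off on the ROC curve.
roc_point = (1 - m["specificity"], m["sensitivity"])

for name, value in m.items():
    print(f"{name}: {value:.3f}")
```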
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Type II error
Type I error
Good decision
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
45
y = 2.03x + 8
40
Weight (kg)
35
30
25
20
15
10
5
0
0
2
4
6
8
Age (y)
10
12
14
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 48
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables…… and the Data
              Mrs Brown    Mr Patel    Ms Manda
Age           32           24          20
Gender        Female       Male        Female
Blood type    O            O           A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
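The four summaries listed above can be sketched in a few lines of Python. The admission records and genders below are hypothetical values invented for illustration (only the category names come from the ICU admission example), and the variable names are assumptions. Cumulative frequency is most meaningful when the categories have a natural order.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical admission records; category names follow the ICU example.
admissions = ["Medical", "Surgical", "Medical", "Poisoning",
              "Surgical", "Medical", "Others", "Medical"]

counts = Counter(admissions)
n = len(admissions)
categories = sorted(counts)                    # fixed display order
frequency = [counts[c] for c in categories]    # frequency
relative = [f / n for f in frequency]          # relative frequency
cumulative = list(accumulate(frequency))       # cumulative frequency

for c, f, r, cum in zip(categories, frequency, relative, cumulative):
    print(f"{c:<10} {f:>3} {r:>7.1%} {cum:>3}")

# Cross tabulation: count each (admission type, gender) pair.
# The genders are hypothetical and paired with the records above.
genders = ["Female", "Male", "Male", "Female",
           "Female", "Male", "Male", "Female"]
crosstab = Counter(zip(admissions, genders))
print(crosstab[("Medical", "Male")])
```

For a numerical variable the same approach applies once the values have been grouped into intervals.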
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart, Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area - relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart, Postoperative Complications; y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough]
Alternative
Width
spacing
[Figure: simple bar chart, Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart, Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart, Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Figure: stacked bar chart, Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: chart of Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
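The odd versus even distinction can be checked with Python's standard library. This is a minimal sketch using the two data sets shown on the slide; the variable names are assumptions.

```python
from statistics import median

odd_sample = [1, 1, 3, 5, 10]   # 5 values: the middle (3rd) sorted value
even_sample = [1, 3, 5, 9]      # 4 values: average of the 2nd and 3rd values

median_odd = median(odd_sample)
median_even = median(even_sample)
print(median_odd)    # 3
print(median_even)   # 4.0
```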
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
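As a quick check, the four measures of location can be computed for the data set 1, 1, 3, 5, 10 with Python's standard library. The midrange is taken here as (minimum + maximum) / 2, which gives (1 + 10) / 2 = 5.5 for these values; the variable names are assumptions.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]

data_mean = mean(data)                    # sum / count = 20 / 5
data_median = median(data)                # middle value of the sorted data
data_mode = mode(data)                    # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the extremes

print(data_mean, data_median, data_mode, midrange)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.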
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
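Percentiles can be worked out for the eleven-value data set above as follows. This is a sketch under one stated convention: `statistics.quantiles` with its default "exclusive" method, which places the p-th percentile at position p(n + 1)/100 in the sorted data and interpolates between neighbours. Other textbooks and software use slightly different rules and may give slightly different answers.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # already sorted, n = 11

# Quartiles: positions 3, 6 and 9 in the sorted data under this convention.
q1, q2, q3 = quantiles(data, n=4)

# Deciles: the 7th and 9th cut points are the 70th and 90th percentiles.
deciles = quantiles(data, n=10)
p70 = deciles[6]
p90 = deciles[8]

print(q1, q2, q3)   # 5.0 9.0 17.0
print(p70, p90)
```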
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
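The figures above can be reproduced with Python's standard library under two assumptions worth stating. First, the variance of 11.2 corresponds to the population formula (divide by n); the sample formula (divide by n - 1) gives 14 for the same data. Second, the inter-quartile range of 4 corresponds to the "inclusive" quartile method, where Q1 = 1 and Q3 = 5; other quartile conventions give other values.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)                      # 10 - 1

q1, _, q3 = quantiles(data, n=4, method="inclusive")    # Q1 = 1, Q3 = 5
iqr = q3 - q1

pop_var = pvariance(data)    # sum of squared deviations / n = 56 / 5
pop_sd = pstdev(data)        # square root of the population variance
sample_var = variance(data)  # 56 / (n - 1), shown for comparison

print(data_range, iqr, pop_var, round(pop_sd, 2), sample_var)
```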
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
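A minimal sketch of both quantities for a sample mean follows. The heart-rate values are hypothetical, and the interval uses the normal approximation (mean ± 1.96 × SE); for a sample this small a t-distribution multiplier would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 75, 70, 78, 74, 69, 80, 73, 71, 76]

n = len(heart_rates)
sample_mean = mean(heart_rates)
standard_error = stdev(heart_rates) / sqrt(n)   # SE of the mean = s / sqrt(n)

z = NormalDist().inv_cdf(0.975)                 # about 1.96 for a 95% interval
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```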
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal)
in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept               Good decision    Type II error
Reject               Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
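The p-value definition above can be made concrete with an exact permutation test, sketched here for two small groups. This is one simple way to obtain a p-value, not necessarily the test used in the lecture. The heart-rate values and the function name `permutation_p_value` are hypothetical. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible relabelings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    as_extreme = 0
    relabelings = 0
    # Enumerate every way of choosing which values form the first group.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:      # tolerance for rounding error
            as_extreme += 1
        relabelings += 1
    return as_extreme / relabelings

# Hypothetical heart rates (beats/min) under each drug.
inderal = [72, 75, 70, 78, 74]
new_drug = [64, 66, 61, 69, 65]

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

If the p-value falls below the chosen α (conventionally 0.05), the null hypothesis of no difference is rejected.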
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8]
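Prediction with the fitted line y = 2.03x + 8 (weight in kg, age in years) can be sketched as below. The function names and the data points are hypothetical; the points are generated to lie exactly on the line, only to show that an ordinary least-squares fit recovers the slope and intercept.

```python
def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line on the slide."""
    return slope * age_years + intercept

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

ages = [2, 4, 6, 8, 10]                        # hypothetical ages
weights = [predict_weight(a) for a in ages]    # points exactly on the line

slope, intercept = fit_line(ages, weights)
print(round(slope, 2), round(intercept, 2))
print(round(predict_weight(5), 2))             # predicted weight at age 5
```

Predictions are only trustworthy within the range of ages used to fit the line.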
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
Test result    Disease present    Disease absent
Positive       TP                 FP
Negative       FN                 TN
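The usual summary measures follow directly from the four cells of this table. A minimal sketch, with hypothetical counts and an assumed function name `diagnostic_summary`:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased who test positive
        "specificity": tn / (tn + fp),   # non-diseased who test negative
        "ppv": tp / (tp + fp),           # positives who have the disease
        "npv": tn / (tn + fn),           # negatives who are disease-free
    }

# Hypothetical counts for illustration.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested.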
[Figure: two ROC curves; y-axis: Sensitivity (0.0 to 1.0); x-axis: 1 - Specificity (0.0 to 1.0); note: diagonal segments are produced by ties.]
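For a continuous test, each possible cut-off yields one (1 - specificity, sensitivity) point, and joining those points gives the ROC curve. The sketch below uses hypothetical test values and assumed function names; a result is called positive when the value is at or above the cut-off, and the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(diseased, healthy):
    """(1 - specificity, sensitivity) for every cut-off, highest first."""
    cutoffs = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]            # cut-off above every value
    for c in cutoffs:
        sensitivity = sum(v >= c for v in diseased) / len(diseased)
        false_positive_rate = sum(v >= c for v in healthy) / len(healthy)
        points.append((false_positive_rate, sensitivity))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test values for patients with and without the disease.
diseased = [0.9, 0.8, 0.7, 0.55]
healthy = [0.6, 0.4, 0.3, 0.2]

points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print(points)
print(auc)
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to perfect separation of the two groups.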
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 50
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Figure: clustered bar chart of Postoperative Complications (series: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Figure: clustered bar chart of Postoperative Complications (series: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%)
Figure: stacked bar chart of Postoperative Complications (x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 100%)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Figure: line chart of Postoperative Complications (series: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%)
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = (1 + 10) / 2 = 5.5
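As a quick check, the four measures of location can be computed for the example values 1, 1, 3, 5, 10 with Python's standard statistics module. This is a minimal sketch; the midrange is taken here as the value halfway between the minimum and the maximum, and the second list (1, 3, 5, 9) is the even-count example from the Median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

mean = statistics.mean(data)            # sum / count = 20 / 5
median = statistics.median(data)        # middle value of the sorted data (odd count)
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even count, the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected, which is one reason the median is often preferred for skewed data.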
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 10 - 1 = 9
Inter-quartile range = 5 - 1 = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
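The measures of spread for 1, 1, 3, 5, 10 can be reproduced with the standard statistics module. Two assumptions are made in this sketch: quartiles are computed with the "inclusive" interpolation method (other conventions give different quartiles for such a small data set), and the variance of 11.2 corresponds to dividing by n rather than n - 1.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

value_range = max(data) - min(data)

# Quartiles: assumption, the "inclusive" method (min and max count as the
# 0th and 100th percentiles). Other methods can return different cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance_n = statistics.pvariance(data)   # sum of squared deviations / n
sd_n = statistics.pstdev(data)            # square root of the variance above
variance_n1 = statistics.variance(data)   # sum of squared deviations / (n - 1)

print(value_range, q1, q3, iqr)
print(variance_n, round(sd_n, 2), variance_n1)
```

Dividing by n - 1 gives the sample variance, which is the usual choice when the data are a sample used to estimate the spread of a larger population.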
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
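A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart rates are hypothetical, the function name mean_confidence_interval is illustrative, and the interval uses the normal approximation (z of about 1.96); for a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the slides.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error: the standard deviation of the sample mean.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    interval = (sample_mean - z * standard_error, sample_mean + z * standard_error)
    return sample_mean, standard_error, interval

mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI=({low:.1f}, {high:.1f})")
```

The sample mean is the statistic, the unknown population mean is the parameter, and the interval expresses how precisely the former estimates the latter.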
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
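One way to test the null hypothesis that the two population means are equal is a permutation test, sketched below. This technique is swapped in here because it needs only the standard library; the slides do not say which test the author used. The heart rates are hypothetical and the function name permutation_test is illustrative.

```python
import random

# Hypothetical heart rates (beats/min); not data from the slides.
inderal = [68, 72, 70, 75, 71, 69, 74, 73]
new_drug = [64, 66, 69, 63, 67, 65, 70, 62]

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Under H0 the group labels are exchangeable, so the labels are reshuffled
    many times and we count how often the shuffled difference is at least as
    extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Adding 1 to both counts keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)

p_value = permutation_test(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

A small p-value counts as evidence against H0; a large one means the data are compatible with no difference, which is not the same as proving that there is none.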
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          The H0 is:
The decision about H0     True             False
Accept                    Good decision    Type II error
Reject                    Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
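The definition of the p-value can be made concrete with an exact calculation. In this hypothetical sign-test setting, 10 patients each receive both drugs and 9 of them have a lower heart rate on the new drug; under H0 each patient is equally likely to favor either drug, so the count follows a Binomial(10, 0.5) distribution. The numbers and the name binomial_tail are assumptions made for illustration.

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the observed outcome or one more extreme."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_patients = 10     # hypothetical
n_favor_new = 9     # hypothetical

one_sided = binomial_tail(n_patients, n_favor_new)
# Doubling is valid here because the Binomial(n, 0.5) distribution is symmetric.
two_sided = min(1.0, 2 * one_sided)

alpha = 0.05  # accepted probability of a type I error
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
print("reject H0" if two_sided < alpha else "do not reject H0")
```

The power of a study is 1 - β, the probability of rejecting H0 when it is false; it increases with sample size, which is why sample size and power are planned together.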
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Figure: scatter plot of Weight (kg) against Age (y) with the fitted line y = 2.03x + 8 (x-axis: 0 to 14; y-axis: 0 to 45)
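Prediction with a fitted line such as weight = slope × age + intercept can be sketched with ordinary least squares. The ages and weights below are hypothetical values chosen to lie exactly on y = 2x + 8, so the fit is easy to verify by hand; they are not the points behind the slide's line, and fit_line is an illustrative helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical ages (years) and weights (kg) lying exactly on y = 2x + 8.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 24, 28]

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept  # predicted weight for a 7-year-old
print(slope, intercept, predicted)
```

Predictions are only trustworthy within the range of ages used to fit the line; extrapolating far outside it assumes the straight-line relation still holds.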
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  The disease (outcome) is:
The test is:      Present    Absent
Positive          TP         FP
Negative          FN         TN
Figure: two ROC curves (x-axis: 1 - Specificity, y-axis: Sensitivity, both from 0.0 to 1.0). Diagonal segments are produced by ties.
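The 2x2 table and the ROC curve can be tied together in a short sketch. For a dichotomous test, sensitivity and specificity follow directly from TP, FP, FN and TN; for a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) point on the ROC curve. All counts and scores below are hypothetical, and the function names are illustrative.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }

def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) for every cut-off 'score >= t is positive'."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= t and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= t and not d)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical counts for a dichotomous test.
summary = diagnostic_summary(tp=40, fp=10, fn=10, tn=90)
print(summary)

# Hypothetical continuous test results with true disease status (1 = present).
points = roc_points([0.9, 0.8, 0.7, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0, 0])
print(points)
```

Lowering the cut-off moves along the curve toward the top right: sensitivity rises while specificity falls, so the choice of cut-off is a trade-off between missed cases and false alarms.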
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 53
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table,
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
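The four tabulations listed above can be sketched in a few lines of Python. This is only an illustration: the patient data, the variable names and the helper `cross_tab` are hypothetical and do not come from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ASA physical status for 10 patients (illustrative data only).
asa_status = ["I", "II", "II", "I", "III", "II", "I", "II", "III", "II"]
# Hypothetical second variable (sex) for the same 10 patients.
sex = ["F", "M", "F", "F", "M", "M", "F", "M", "F", "M"]

categories = ["I", "II", "III"]            # ordinal categories, in order
counts = Counter(asa_status)
n = len(asa_status)

frequency = [counts[c] for c in categories]        # how many patients per category
relative = [f / n for f in frequency]              # share of all patients
cumulative = list(accumulate(frequency))           # running total (meaningful for ordered categories)


def cross_tab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))


table = cross_tab(asa_status, sex)

for c, f, r, cf in zip(categories, frequency, relative, cumulative):
    print(f"ASA {c}: frequency={f}, relative={r:.0%}, cumulative={cf}")
print("ASA II and male:", table[("II", "M")])
```

A numerical variable can be tabulated the same way once its values have been grouped into intervals.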
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%.]
Advantages:
1. Summarizes the data (area shows relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (needs separate pies)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. Y-axis: Number of patients, 0 to 25. Categories: Nausea, Vomiting, Pain, Cough.]
Alternative: width, spacing
[Bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 45%. Categories: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 45%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 45%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II.]
[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 100%. Bars: Group I, Group II. Stacked segments: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Line chart: Postoperative Complications. Y-axis: Number of patients (%), 0% to 40%. Categories: Nausea, Vomiting, Pain, Cough. Series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
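As a quick check, the four measures of location can be computed for the slide's data set with Python's standard `statistics` module. This is a minimal sketch; the midrange is taken here as (minimum + maximum) / 2, and the variable names are illustrative.

```python
import statistics

data = [1, 1, 3, 5, 10]          # the data set used on the slides

mean = statistics.mean(data)                  # sum of values / number of values
median = statistics.median(data)              # middle value of the sorted data
mode = statistics.mode(data)                  # most frequent value
midrange = (min(data) + max(data)) / 2        # halfway between the extremes

# With an even number of values the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean (4) lies above the median (3) because the single large value 10 pulls it upward, while the median is unaffected by how extreme that value is.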
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
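Quartiles and percentiles of the eleven-value data set above can be sketched with `statistics.quantiles`. Several conventions for interpolating percentiles exist, and the lecture does not say which one it uses; the sketch below assumes Python's default ("exclusive") method, which places the p-th percentile at position p × (n + 1) in the sorted data.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # already sorted, n = 11

# n=4 returns the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4)

# n=10 returns the nine deciles; the 7th and 9th are the 70th and 90th percentiles.
deciles = statistics.quantiles(data, n=10)
p70, p90 = deciles[6], deciles[8]

print("Q1:", q1, "median:", q2, "Q3:", q3)
print("70th percentile:", p70, "90th percentile:", p90)
```

Other software may report slightly different percentiles for the same data because it interpolates differently.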
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
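The measures of spread can be reproduced in the same way. This sketch assumes the population formulas (dividing by n), which is what yields 11.2 and 3.35 for this data set, and the "inclusive" quartile method, which yields an inter-quartile range of 4. The sample variance (dividing by n - 1) is shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles with the "inclusive" method: Q1 = 1 and Q3 = 5 for this data set.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # population variance: divide by n
sd = statistics.pstdev(data)                 # square root of the population variance
sample_variance = statistics.variance(data)  # sample variance: divide by n - 1

print(data_range, iqr, variance, round(sd, 2), sample_variance)
```

When the data are a sample used to estimate a population value, the n - 1 version is the usual choice, and it gives a larger result (14 here).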
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
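A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical, and the interval uses the normal approximation (multiplier of about 1.96); for a sample this small a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates in beats/min (illustrative values only).
heart_rates = [72, 68, 75, 80, 66, 74, 71, 78, 69, 77]

n = len(heart_rates)
mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)       # sample SD (n - 1 denominator)
se = sd / math.sqrt(n)                   # standard error of the mean

# Normal approximation: 97.5th percentile of the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.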
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
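With the two population means defined, the null hypothesis would be that they are equal. One way to test it, sketched below, is an exact permutation test. This is an illustrative technique chosen because it needs no distribution tables, not necessarily the test the lecture has in mind, and the heart-rate values are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates in beats/min (illustrative values only).
inderal = [70, 72, 68, 75, 71]
new_drug = [64, 66, 69, 62, 65]

observed = mean(inderal) - mean(new_drug)    # observed difference in sample means

pooled = inderal + new_drug
n1 = len(inderal)
n2 = len(pooled) - n1
total = sum(pooled)

# Under H0 the group labels are interchangeable, so every way of splitting
# the pooled values into groups of the same sizes is equally likely.
count = 0
n_splits = 0
for idx in combinations(range(len(pooled)), n1):
    group1_sum = sum(pooled[i] for i in idx)
    diff = group1_sum / n1 - (total - group1_sum) / n2
    n_splits += 1
    # Two-sided: count splits at least as extreme as the observed one
    # (small tolerance guards against floating-point rounding).
    if abs(diff) >= abs(observed) - 1e-12:
        count += 1

p_value = count / n_splits
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

The p-value here is the proportion of relabelings that produce a difference at least as large as the one observed, which matches the definition of a p-value as the probability of the observed outcome (or one more extreme) when the null hypothesis is true.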
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg), 0 to 45, against Age (y), 0 to 14. Fitted line: y = 2.03x + 8]
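The fitted line on this slide can be used for prediction by substituting an age for x. A minimal sketch follows; the function name `predict_weight_kg` is hypothetical, and the coefficients are the ones shown on the plot.

```python
def predict_weight_kg(age_years: float) -> float:
    """Apply the fitted line from the slide: weight = 2.03 * age + 8."""
    return 2.03 * age_years + 8


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```

Predictions are only reasonable within the range of ages used to fit the line (the plotted axis runs from 0 to 14 years); extrapolating beyond it is not supported by the data.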
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                     Disease present   Disease absent
Test positive        TP                FP
Test negative        FN                TN
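From the four cells of this table the usual summary measures of a dichotomous diagnostic test follow directly. The sketch below uses hypothetical counts and a hypothetical helper `diagnostic_metrics`; the predictive values are standard measures added for completeness and are not named on the slide.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures computed from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # disease-free patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly disease-free
    }


# Hypothetical counts (illustrative values only).
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.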
[ROC curves (two panels): Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
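For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and joining those pairs gives the ROC curve. The sketch below shows the idea with hypothetical test scores; the function names `roc_points` and `auc` are illustrative, and the area is computed with the trapezoidal rule.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    A patient is called test-positive when the score is >= the cut-off.
    """
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]                      # cut-off above every score
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, has_disease) if not d and s >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical scores and true disease status (illustrative values only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]

points = roc_points(scores, has_disease)
print(points)
print("AUC:", auc(points))
```

When a diseased and a disease-free patient share the same score, one cut-off moves both coordinates at once, which is what produces a diagonal segment on the curve.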
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 55
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker is more
efficient than another conventional βblocker (e.g. Inderal) in decreasing heart
rate or not.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
having Inderal is 1
Let the mean heart rate of all people
having the other new drug is 2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The Ho is:
The
Accept
decision
about
reject
Ho
True
false
Good decision
Type II error
Type I error
Good decision
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg) against age (y) with fitted line y = 2.03x + 8]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
Test result    Disease present    Disease absent
Positive       TP                 FP
Negative       FN                 TN
[Figure: two ROC curves, sensitivity against 1 - specificity; diagonal segments are produced by ties]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 56
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
            Age    Gender    Blood type
Mrs Brown   32     Female    O
Mr Patel    24     Male      O
Ms Manda    20     Female    A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
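As a rough sketch of the frequency measures listed above, the snippet below tabulates frequency, relative frequency and cumulative frequency for one qualitative variable. The complication labels and counts are made-up illustration data, not figures from the lecture, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations: one postoperative complication per patient.
complications = ["Nausea", "Pain", "Pain", "Vomiting",
                 "Nausea", "Pain", "Cough", "Pain"]


def frequency_table(values):
    """Return rows of (category, frequency, relative freq, cumulative freq)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    # Most frequent category first; ties keep first-seen order.
    for category, freq in counts.most_common():
        running += freq
        rows.append((category, freq, freq / total, running))
    return rows


table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.1%}{cum:>4}")
```

Cumulative frequency is most meaningful when the categories have a natural order (an ordinal variable); for purely nominal categories it depends on the arbitrary order of the rows.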
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart of Postoperative Complications (nausea, vomiting, pain, cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: two simple bar charts of Postoperative Complications (nausea, vomiting, pain, cough), one showing number of patients and one showing percentage of patients]
Alternative
Width
Spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of Postoperative Complications (nausea, vomiting, pain, cough), percentage of patients in Group I and Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart of Postoperative Complications (nausea, vomiting, pain, cough) in Group I and Group II, alongside a 100% stacked bar chart of the same data with one bar per group]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon of Postoperative Complications (nausea, vomiting, pain, cough), percentage of patients in Group I and Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
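The location measures for the example values 1, 1, 3, 5, 10 can be checked with Python's standard library. This is only a sketch; the midrange is computed here as (minimum + maximum) / 2, which gives 5.5 for these values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

mean = statistics.mean(data)            # sum of the values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between smallest and largest

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean above the median, which is one reason the median is often preferred for skewed data.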
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
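A possible way to compute quartiles and percentiles for the eleven values above is sketched below. Several conventions for interpolating percentiles exist; the "exclusive" method of Python's `statistics.quantiles` is an assumption here, not necessarily the convention used in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the 11 values from the slide

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Deciles: nine cut points; index 6 is the 70th percentile, index 8 the 90th.
deciles = statistics.quantiles(data, n=10, method="exclusive")
p70, p90 = deciles[6], deciles[8]

print(q1, q2, q3, p70, p90)
```

With this convention the quartiles fall exactly on the 3rd, 6th and 9th sorted values, while the 70th and 90th percentiles are interpolated between neighbouring values.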
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
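The spread measures above can be reproduced with a short sketch. Two assumptions are made to match the slide values: the quartiles use the "inclusive" interpolation method, and the variance divides by n (population form) rather than n - 1.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (assumed convention).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # divides by n
std_dev = statistics.pstdev(data)            # square root of the above
sample_variance = statistics.variance(data)  # divides by n - 1, for comparison

print(value_range, iqr, variance, round(std_dev, 2), sample_variance)
```

The n - 1 form gives a larger value for the same data; it is the usual choice when the values are a sample used to estimate a population variance.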
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
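A minimal sketch of the standard error of a mean and an approximate 95% confidence interval follows. The heart-rate values are hypothetical, and the interval uses the normal multiplier (about 1.96); for a sample this small a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of resting heart rates (beats per minute).
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)   # sample SD (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)   # SD of the sample mean

# Normal-approximation multiplier for a two-sided 95% interval.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean={sample_mean}, SE={standard_error:.2f}, "
      f"95% CI=({ci_low:.1f}, {ci_high:.1f})")
```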
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test the null hypothesis, researchers take
samples, measure outcomes, and decide
whether the sample data provide strong
enough evidence to reject the null
hypothesis.
If the evidence against the null hypothesis is strong
enough for us to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal)
in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
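Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean on the new drug, the research question can be stated as a pair of hypotheses. The two-sided alternative shown here is an assumption; the slides do not say which alternative was intended.

```latex
H_0:\ \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1:\ \mu_1 \neq \mu_2
```

If the question is only whether the new drug lowers heart rate more than Inderal, a one-sided alternative (μ2 < μ1) could be used instead.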
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept               Good decision    Type II error
Reject               Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
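One way to make the p-value definition concrete is an exact permutation test, sketched below. It is a swapped-in illustration rather than a method named in the slides: under the null hypothesis the drug labels are interchangeable, so every way of splitting the pooled values into two groups is equally likely. The heart-rate data and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats per minute) in two small groups.
inderal = [70, 72, 68, 74, 71]
new_drug = [64, 66, 63, 69, 65]


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)
    extreme = count = 0
    # Enumerate every relabelling of the pooled values into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        # Count outcomes at least as extreme as the observed one
        # (small tolerance guards against floating-point rounding).
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / count


p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.3f}")
```

For these made-up values only 4 of the 252 possible splits are as extreme as the observed one, so the p-value is about 0.016. Comparing it with a chosen α (for example 0.05) gives the decision to reject or not reject H0.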
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg) against age (y) with fitted line y = 2.03x + 8]
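The prediction example uses a straight line, y = 2.03x + 8, relating weight (kg) to age (years). The sketch below shows how such a line could be fitted by ordinary least squares and then used for prediction. The ages and weights are hypothetical values chosen to lie exactly on a line, and both function names are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) using the line shown on the slide."""
    return slope * age_years + intercept


# Hypothetical ages (years) and weights (kg), for illustration only.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 24, 28]
slope, intercept = least_squares_line(ages, weights)

print(slope, intercept)            # fitted line for the made-up data
print(round(predict_weight(6), 2)) # slide's line applied to a 6-year-old
```

Predictions are only trustworthy within the range of ages used to fit the line.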
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
Test result    Disease present    Disease absent
Positive       TP                 FP
Negative       FN                 TN
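From the four cells TP, FP, FN and TN, the usual summary measures of a dichotomous diagnostic test can be computed as sketched below. The counts are hypothetical and `diagnostic_metrics` is an illustrative helper, not something taken from the lecture.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures of a dichotomous diagnostic test from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# Hypothetical counts for 300 patients.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.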
[Figure: two ROC curves, sensitivity against 1 - specificity; diagonal segments are produced by ties]
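For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and the ROC curve joins those pairs. The sketch below builds the points and the area under the curve by the trapezoid rule. The scores and disease labels are hypothetical, and the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; a result >= cut-off counts as positive.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical test values and true disease status (1 = present).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect discrimination; the made-up data here give about 0.889.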
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Example
Let the mean heart rate of all people
having Inderal is 1
Let the mean heart rate of all people
having the other new drug is 2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The Ho is:
The
Accept
decision
about
reject
Ho
True
false
Good decision
Type II error
Type I error
Good decision
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
45
y = 2.03x + 8
40
Weight (kg)
35
30
25
20
15
10
5
0
0
2
4
6
8
Age (y)
10
12
14
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 59
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables and the Data
              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status (ASA)
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
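A minimal sketch of drawing a simple random sample from a population, using only the Python standard library. The population of 500 patient IDs, the sample size of 30 and the helper name simple_random_sample are hypothetical and are not taken from the slides.

```python
import random

def simple_random_sample(population, size, seed=None):
    """Draw `size` distinct members of the population, each equally likely."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(population, size)

# Hypothetical population: ID numbers of 500 patients (illustration only).
population = list(range(1, 501))
sample = simple_random_sample(population, 30, seed=1)

print(len(sample))                              # size of the subset
print(set(sample).issubset(population))         # every member comes from the population
```

Statistics computed on such a sample are then used to make inferences about the whole population.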
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping into classes)
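The four summaries listed above can be sketched in a few lines of Python. The observations and group labels below are hypothetical, and the helper names frequency_table and cross_tabulate are illustrative only. Cumulative frequency is most meaningful when the categories have a natural order; here the categories are simply sorted alphabetically.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one qualitative variable (not from the slides).
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Vomiting"]
# Hypothetical second variable for the cross tabulation.
groups = ["I", "I", "II", "I", "II", "II", "I", "I", "II", "II"]

def frequency_table(values):
    """Rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    categories = sorted(counts)                 # fixed, alphabetical order
    freqs = [counts[c] for c in categories]
    cumulative = list(accumulate(freqs))
    return [(c, f, f / total, cum)
            for c, f, cum in zip(categories, freqs, cumulative)]

def cross_tabulate(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.0%}{cum:>4}")

crosstab = cross_tabulate(complications, groups)
print(crosstab[("Pain", "I")], crosstab[("Pain", "II")])
```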
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar charts of Postoperative Complications (Nausea, Vomiting, Pain, Cough); y-axis: number of patients, shown as counts and as percentages]
Notes: alternative, width, spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered and stacked bar charts of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
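The four measures of location can be checked on the worked data set 1, 1, 3, 5, 10 with the Python standard library. This is a sketch; the helper midrange is not a library function and takes the midrange as (smallest + largest) / 2, which gives 5.5 for these values.

```python
from statistics import mean, median, multimode

data = [1, 1, 3, 5, 10]        # the worked data set from the slides
even = [1, 3, 5, 9]            # the even-sized example from the median slide

def midrange(values):
    """Midpoint between the smallest and the largest value."""
    return (min(values) + max(values)) / 2

print(mean(data))        # sum of values divided by their count
print(median(data))      # middle value of the ordered data (odd count)
print(median(even))      # mean of the two middle values (even count)
print(multimode(data))   # most frequent value(s), returned as a list
print(midrange(data))    # (min + max) / 2
```

The mean (4) is pulled above the median (3) by the single large value 10, which illustrates its sensitivity to extreme values.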
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
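A sketch of the percentile calculation for the 11 ordered values above. Several conventions exist for locating a percentile; this example assumes the rule that puts the p-th percentile at position p(n + 1)/100 in the ordered data and interpolates between neighbours, which is what Python calls the "exclusive" method. Other conventions can give slightly different answers.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # the slide's 11 ordered values

# Assumption: "exclusive" rule, position = p * (n + 1) / 100 with interpolation.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")     # quartiles
deciles = quantiles(data, n=10, method="exclusive")       # 10th, 20th, ..., 90th
p70, p90 = deciles[6], deciles[8]

print(q1, q3)       # 1st and 3rd quartiles: the 3rd and 9th ordered values
print(p70, p90)     # 70th and 90th percentiles (interpolated)
```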
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
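The four measures of spread can be reproduced with the Python standard library. Two assumptions are needed to match the slide's figures: the variance divides by n (the population formula, giving 11.2) rather than by n - 1 (the sample formula, which would give 14), and the quartiles use the "inclusive" rule.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]                  # the worked data set from the slides

data_range = max(data) - min(data)       # largest minus smallest value

# Assumption: "inclusive" quartile rule, which yields Q1 = 1 and Q3 = 5 here.
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = pvariance(data)           # sum of squared deviations / n
pop_sd = pstdev(data)                    # square root of the variance above
sample_variance = variance(data)         # sum of squared deviations / (n - 1)

print(data_range, iqr)
print(pop_variance, round(pop_sd, 2))
print(sample_variance)
```

The squared deviations from the mean of 4 are 9, 9, 1, 1 and 36, which sum to 56; dividing by 5 gives 11.2 and dividing by 4 gives 14.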
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
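A minimal sketch of the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are hypothetical, and the interval uses the normal approximation (mean ± 1.96 × SE); for a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)                      # SD of the sample mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    return m, se, (m - z * se, m + z * se)

# Hypothetical heart rates (beats/min) of 10 patients, for illustration only.
heart_rates = [72, 68, 75, 80, 66, 74, 71, 77, 69, 73]

m, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {m:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```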
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
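One way to test the null hypothesis that the two population means are equal is an exact permutation test, sketched below. This is an illustrative choice, not necessarily the test the lecture uses, and the heart-rate values are hypothetical. The function reshuffles the group labels in every possible way and reports the share of reshuffles whose difference in means is at least as large as the one observed, which is a two-sided p-value.

```python
from itertools import combinations
from statistics import mean

def permutation_test(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    count = 0
    # Every way of choosing which pooled values are labelled "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:      # tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical heart rates (beats/min) after each drug, for illustration only.
inderal = [70, 72, 68, 75, 71, 69]
new_drug = [66, 69, 64, 70, 67, 65]

p_value = permutation_test(inderal, new_drug)
print(f"p = {p_value:.3f}")
```

A small p-value is evidence against the null hypothesis; a large one means the data do not allow it to be rejected.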
Inferential statistics
(Informed guess)
Type I & Type II Errors
                            H0 is true       H0 is false
Decision: accept H0         Good decision    Type II error
Decision: reject H0         Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
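A sketch of how the type I error rate, the power (1 minus the type II error rate) and the sample size are linked, using the usual normal-approximation formula for comparing two means: n per group = 2 × ((z for 1 - α/2) + (z for power))² × σ² / δ². The difference to detect (δ = 5 beats/min) and the standard deviation (σ = 10 beats/min) are hypothetical planning values.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta` (two-sided test)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = standard_normal.inv_cdf(power)            # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values, for illustration only.
print(sample_size_per_group(delta=5, sigma=10))                # 80% power
print(sample_size_per_group(delta=5, sigma=10, power=0.90))    # more power needs more patients
```

Halving the difference to be detected roughly quadruples the required sample size, because δ enters the formula squared.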
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8]
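A sketch of prediction with a fitted straight line. The function predict_weight uses the equation shown on the slide's plot (y = 2.03x + 8, weight in kg against age in years); fit_line shows how a slope and intercept of this kind can be obtained by least squares. Both function names are illustrative, and the ages used to exercise fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_weight(age, slope=2.03, intercept=8.0):
    """Predicted weight (kg) at a given age (years), using the slide's line."""
    return slope * age + intercept

print(round(predict_weight(6), 2))       # predicted weight at 6 years

# Hypothetical ages; the weights are placed exactly on the slide's line,
# so the fit should recover a slope near 2.03 and an intercept near 8.
ages = [1, 3, 5, 8, 12]
weights = [predict_weight(a) for a in ages]
slope, intercept = fit_line(ages, weights)
print(round(slope, 2), round(intercept, 2))
```

Predictions are only trustworthy within the range of ages that was actually observed.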
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present   Disease absent
Test positive       TP                FP
Test negative       FN                TN
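The four cells of the table give the usual measures of a dichotomous diagnostic test. The sketch below computes them; the counts are hypothetical and the function name diagnostic_metrics is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.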
[Figure: two ROC curves; y-axis: Sensitivity, x-axis: 1 - Specificity. Diagonal segments are produced by ties.]
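For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity; plotting all pairs gives the ROC curve. The sketch below builds those points and the area under the curve by the trapezoid rule. The scores and disease labels are hypothetical. Because observations sharing the same score are handled in a single step, tied scores produce a diagonal segment.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, one per distinct cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A patient is called test-positive when score >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test scores and true disease status, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points[-1])                            # lowest cut-off: everyone positive
print(round(area_under_curve(points), 3))    # 0.5 = chance, 1.0 = perfect
```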
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
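A minimal sketch of how censored data are handled, using the Kaplan-Meier estimate as one common example (the slides do not name a specific method). The follow-up times and event flags are hypothetical. A censored patient counts as "at risk" up to the time of censoring but never counts as an event.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time of each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns [(time, survival probability)] at each time with an event.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up (months); 0 marks a censored observation.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.2f}")
```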
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 61
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients, 0 to 25; categories: Nausea, Vomiting, Pain, Cough]
Alternative
Width
spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; bars: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
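The four measures of location can be checked with Python's standard `statistics` module. This is a minimal sketch. `midrange` is a small helper written here, since the module has no such function. For 1, 1, 3, 5, 10 the midrange is (1 + 10) / 2 = 5.5.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]  # the example values from the slides


def midrange(values):
    """Halfway point between the smallest and the largest value."""
    return (min(values) + max(values)) / 2


print("mean    ", mean(data))
print("median  ", median(data))
print("mode    ", mode(data))
print("midrange", midrange(data))

# Even number of values: the median is the average of the two middle ones.
print("median of 1-3-5-9:", median([1, 3, 5, 9]))
```

The mean uses every value and so is pulled towards the outlying 10, while the median depends only on the middle of the ordered data.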
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
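As a sketch, the quartiles and percentiles of the eleven-value series can be computed with `statistics.quantiles`. The default "exclusive" method places the p-th percentile at position (n + 1) × p in the ordered data. Other textbooks and packages use slightly different conventions, so small differences between tools are expected.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # eleven ordered values

# Three cut points split the data into four quarters.
q1, q2, q3 = quantiles(data, n=4)

# Nine cut points split the data into tenths: 10th, 20th, ..., 90th percentile.
deciles = quantiles(data, n=10)
p70, p90 = deciles[6], deciles[8]

print("quartiles:", q1, q2, q3)
print("70th percentile:", p70)
print("90th percentile:", p90)
```

With eleven values the quartile positions (3rd, 6th and 9th value) are whole numbers, so no interpolation is needed for them.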
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
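These figures can be reproduced with the standard library, as sketched below. Two assumptions are needed to match them: the quartiles use the "inclusive" convention (Q1 = 1, Q3 = 5), and the variance divides by n (population variance). Dividing by n − 1, the usual choice for a sample, gives 14 instead of 11.2.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Inclusive quartiles; other conventions give a different IQR for such a small set.
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = pvariance(data)     # sum of squared deviations divided by n
pop_sd = pstdev(data)         # square root of the population variance
sample_var = variance(data)   # same sum divided by n - 1

print(value_range, iqr, pop_var, round(pop_sd, 2), sample_var)
```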
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
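A minimal sketch of both ideas for a sample mean follows. The heart-rate values are invented, and the interval uses a normal approximation (z of about 1.96). For a sample this small a t critical value would normally be used, which gives a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(sample, confidence=0.95):
    """Return (mean, standard error, lower, upper) by normal approximation."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))           # SD of the sample mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return m, se, m - z * se, m + z * se


# Hypothetical heart rates (beats/min) from ten patients.
heart_rates = [72, 68, 75, 80, 66, 74, 70, 77, 69, 73]
m, se, low, high = mean_with_ci(heart_rates)
print(f"mean {m:.1f}, SE {se:.2f}, 95% CI {low:.1f} to {high:.1f}")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample roughly halves the width of the interval.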
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
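Writing μ1 and μ2 for the two population mean heart rates, the hypotheses for this example could be stated as below. The two-sided form is an assumption. If only "the new drug lowers heart rate more" is of interest, a one-sided alternative μ2 < μ1 would be used instead.

```latex
\[
\begin{aligned}
H_0 &: \mu_1 = \mu_2 && \text{(no difference in mean heart rate)} \\
H_1 &: \mu_1 \neq \mu_2 && \text{(the mean heart rates differ)}
\end{aligned}
\]
```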
Inferential statistics
(Informed guess)
Type I & Type II Errors
                              H0 is true       H0 is false
Decision: accept H0           Good decision    Type II error
Decision: reject H0           Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
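To make the p-value definition concrete, here is a sketch for a test statistic that follows a standard normal distribution under the null hypothesis. The statistic 2.5 and the threshold 0.05 are illustrative assumptions, not values from the lecture.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a result at least as extreme as z, in either direction, under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05                    # accepted probability of a type I error (assumed)
p = two_sided_p_value(2.5)      # hypothetical observed test statistic
reject_h0 = p < alpha

print(f"p = {p:.4f}, reject H0: {reject_h0}")
```

The power of a study is 1 − β. It increases with sample size, which is why sample size is planned from the chosen α, the desired power and the smallest effect worth detecting.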
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
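Prediction with the fitted line y = 2.03x + 8 (weight in kg against age in years) amounts to plugging an age into the equation. The sketch below only reuses the slope and intercept shown on the chart. The function name is made up for the example.

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight from the fitted line y = 2.03x + 8."""
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```

Such a line should only be used within the range of ages that was observed (roughly 0 to 14 years on the chart). Extrapolating beyond it is not supported by the data.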
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                        Disease present    Disease absent
Test positive           TP                 FP
Test negative           FN                 TN
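From the four cells of this table the usual accuracy measures follow directly. The sketch below uses invented counts. The function and key names are chosen for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


# Hypothetical counts: 50 diseased and 100 disease-free patients.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name:<12}{value:.2f}")
```

For a continuous test, each possible cut-off gives its own table. Plotting sensitivity against 1 − specificity over all cut-offs produces the ROC curve.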
[Two ROC curve plots: y-axis Sensitivity, x-axis 1 - Specificity, both from 0.0 to 1.0. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 64
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable.
The variables and the data:
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories.
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
b) Ordinal variable
A categorical variable whose values are ordered.
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Examples of populations:
All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
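The four summaries listed above can be sketched in a few lines of Python. The counts below are hypothetical: they were chosen only so that the percentages match the figures shown on the pie chart slide (23.5%, 27.5%, 9.8%, 39.2%), and both the assignment of counts to categories and the group records used for the cross tabulation are assumptions made for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical frequencies (number of patients per complication).
counts = {"Nausea": 12, "Vomiting": 14, "Pain": 5, "Cough": 20}

total = sum(counts.values())

# Relative frequency: each count divided by the total.
relative = {name: n / total for name, n in counts.items()}

# Cumulative frequency: running total in the listed order.
cumulative = dict(zip(counts, accumulate(counts.values())))

print(f"{'Category':<10}{'n':>5}{'rel.':>8}{'cum.':>6}")
for name in counts:
    print(f"{name:<10}{counts[name]:>5}{relative[name]:>8.1%}{cumulative[name]:>6}")

# Cross tabulation: count each (group, complication) pair.
# These patient records are hypothetical.
records = [
    ("Group I", "Nausea"), ("Group I", "Nausea"), ("Group I", "Pain"),
    ("Group II", "Nausea"), ("Group II", "Cough"), ("Group II", "Cough"),
]
crosstab = Counter(records)
for (group, complication), n in sorted(crosstab.items()):
    print(group, complication, n)
```

Cumulative frequency is most informative for ordinal or grouped numerical data, where the category order carries meaning; for a nominal variable such as this one the running total depends on an arbitrary ordering.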
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Slice area represents relative frequency.
Advantages:
1. Summarizes the data
2. Shows relative magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (separate pies are needed)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough]
Alternatives: bar width, spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between the bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5 (the average of the smallest and largest values: (1 + 10) / 2)
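The four measures of location can be checked on the slide's data set (1, 1, 3, 5, 10) with Python's standard `statistics` module. This is a minimal sketch; the midrange is computed here as (minimum + maximum) / 2, which gives 5.5 for these values. The second data set (1, 3, 5, 9) is the even-length example from the median slide.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]

data_mean = mean(data)                     # sum of values / number of values
data_median = median(data)                 # middle value of the sorted data
data_mode = mode(data)                     # most frequent value
data_midrange = (min(data) + max(data)) / 2

# Odd vs even: with an even number of values the median is the
# average of the two middle values.
even_data = [1, 3, 5, 9]
even_median = median(even_data)            # (3 + 5) / 2

print(data_mean, data_median, data_mode, data_midrange, even_median)
```

The mean (4) is pulled above the median (3) by the single large value 10, which is the usual argument for reporting the median when data are skewed.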
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
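Variance = 11.2 (population formula, dividing by n; the sample formula, dividing by n - 1, gives 14)
Standard deviation = 3.35 (square root of the variance 11.2)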
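The measures of spread quoted for 1, 1, 3, 5, 10 can be reproduced with the standard `statistics` module. Two choices in this sketch are assumptions rather than something the slides state: quartiles are computed with the "inclusive" method (other quartile conventions give a different inter-quartile range for so few values), and the variance uses the population formula (dividing by n), which is the one that yields 11.2.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; see lead-in).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = pvariance(data)   # mean squared deviation, divides by n
pop_sd = pstdev(data)            # square root of the population variance
sample_variance = variance(data) # divides by n - 1 instead

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are treated as a sample from a larger population, the n - 1 version (14 here) is the conventional estimate of the population variance.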
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population parameters from sample statistics
Standard error (SD of the statistic)
Confidence interval (95% CI)
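As an illustration of the two terms above, the sketch below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values and the function name `mean_confidence_interval` are hypothetical. The interval uses the normal approximation (mean ± about 1.96 standard errors); for small samples a t-distribution multiplier would be wider, so treat this as a sketch rather than a recommended procedure.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Return (mean, standard error, (lower, upper)) for a sample mean.

    Uses the normal approximation, which is an assumption that is
    reasonable only for moderately large samples.
    """
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)               # SD of the sample mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return m, se, (m - z * se, m + z * se)


# Hypothetical heart rates (beats per minute) from ten patients.
heart_rates = [72, 68, 75, 70, 66, 74, 71, 69, 73, 72]
m, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {m:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```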
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal)
in decreasing heart rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: probability of committing a type I error
β: probability of committing a type II error
p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study (power = 1 - β)
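The p-value definition can be made concrete with an exact permutation test. This technique is swapped in here for illustration and is not one named on the slides. Under the null hypothesis of no difference between two groups, every way of splitting the pooled values into the two groups is equally likely, so the p-value is the fraction of splits whose difference in means is at least as extreme as the observed one. The function name and the heart-rate values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    pooled_sum = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    splits = 0
    # Enumerate every way of choosing which pooled values form group A.
    for chosen in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in chosen)
        diff = abs(sum_a / n_a - (pooled_sum - sum_a) / n_b)
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical heart rates (beats per minute) under each drug.
inderal = [72, 75, 70, 68, 74]
new_drug = [65, 66, 70, 63, 67]
p = permutation_p_value(inderal, new_drug)
print(f"p-value = {p:.3f}")
```

Full enumeration is only practical for small groups, because the number of splits grows very quickly with sample size; with larger samples a random subset of splits is normally used instead.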
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
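The fitted line on this slide, y = 2.03x + 8 with x as age in years and y as weight in kilograms, can be used directly for prediction. The sketch below also shows how a slope and intercept like these are obtained by ordinary least squares. The function names are hypothetical, and the slide's own data points are not available, so the fitting step is shown on made-up points that lie exactly on a known line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the slide's line."""
    return slope * age_years + intercept


# Made-up points lying exactly on y = 2x + 8, to check the fitting step.
slope, intercept = fit_line([0, 2, 4, 6], [8, 12, 16, 20])
print(slope, intercept)

# Prediction with the coefficients quoted on the slide.
print(f"Predicted weight at age 6: {predict_weight(6):.2f} kg")
```

Predictions are only trustworthy inside the range of ages used to fit the line (0 to 14 years on the plot's axis); extrapolating to adults would give implausible weights.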
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
[Two ROC curves; y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.]
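The 2×2 table and the ROC axes above can be tied together with a short sketch. For a dichotomous test, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). For a continuous test, each cut-off yields one (1 - specificity, sensitivity) point, and observations tied at a cut-off move the curve in both directions at once, which is what produces a diagonal segment. All function names and numbers here are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects with a positive test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of disease-free subjects with a negative test."""
    return tn / (tn + fp)


def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A higher score is taken to mean "more likely diseased" (an assumption).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Count all observations tied at this cut-off before adding a point.
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


print(sensitivity(tp=40, fn=10), specificity(tn=45, fp=5))
curve = roc_points([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0])
print(curve, area_under_curve(curve))
```

An area of 1.0 corresponds to a test that separates the two groups perfectly, and 0.5 to a test that does no better than chance.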
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of, and need for, statistics in the medical field
Distinguish types of data and variables
Describe types of statistics and statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
[Diagram: Collecting data, Describing data, Analyzing data, Good decision]
Why do we need statistics?
Variability (atropine)
Causes:
Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Effect of variability:
Large amount of data describing the same thing (many values for one variable)
No certainty ("deterministic vs probabilistic")
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Infer (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
Variable      Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, perhaps putting it into a table, maybe presenting it in an appropriate chart, or summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
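The four summaries listed for a qualitative variable (frequency, relative frequency, cumulative frequency, cross tabulation) can be sketched in a few lines of Python. The observations and the group labels below are hypothetical and are not the lecture's data; the function names are illustrative only. Cumulative frequency depends on the order of the categories, so it is most meaningful for ordinal variables.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (not the lecture's data).
complications = ["Nausea", "Pain", "Pain", "Cough", "Nausea", "Vomiting",
                 "Pain", "Nausea", "Pain", "Vomiting"]
# Hypothetical second variable used only for the cross tabulation.
groups = ["I", "I", "II", "II", "I", "II", "I", "II", "II", "I"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    # most_common() lists categories from most to least frequent.
    for category, count in counts.most_common():
        cumulative += count
        rows.append((category, count, count / total, cumulative))
    return rows


def cross_tabulate(row_values, column_values):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(row_values, column_values))


table = frequency_table(complications)
for category, count, relative, cumulative in table:
    print(f"{category:<10}{count:>4}{relative:>8.1%}{cumulative:>5}")

crosstab = cross_tabulate(groups, complications)
print(crosstab[("I", "Nausea")])
```

A numerical variable could be passed through the same functions after grouping its values into classes (for example age bands).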
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; four slices of 23.5%, 27.5%, 9.8% and 39.2% for the categories Nausea, Vomiting, Pain and Cough (order not preserved)]
Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough]
Alternative
Width
Spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
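The four measures of central tendency can be checked on the lecture's data set 1, 1, 3, 5, 10 with Python's standard `statistics` module. This is a minimal sketch; `midrange` is a small helper written here (the module has no such function), defined as the average of the smallest and largest value, which gives 5.5 for this data. The second data set 1, 3, 5, 9 illustrates the even-count case for the median.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]


def midrange(values):
    """Average of the smallest and the largest value."""
    return (min(values) + max(values)) / 2


summary = {
    "mean": mean(data),          # sum of values divided by their count
    "median": median(data),      # middle value of the sorted data (odd count)
    "mode": mode(data),          # most frequent value
    "midrange": midrange(data),  # (min + max) / 2
}
print(summary)

# With an even number of values the median is the average of the two middle values.
even_median = median([1, 3, 5, 9])
print(even_median)
```

The mean uses every value and is pulled upward by the single large value 10, while the median is not, which is one reason both are usually reported for skewed data.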
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; dividing by n - 1 gives 14)
Standard deviation = 3.35 (square root of 11.2)
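The measures of spread for the same data set 1, 1, 3, 5, 10 can be reproduced with the standard `statistics` module. Two points are assumptions of this sketch rather than statements from the slides: the quartiles are computed with the "inclusive" interpolation method (other conventions can give slightly different quartiles), and the values 11.2 and 3.35 correspond to the population formulas, which divide by n.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

# Range: largest value minus smallest value.
value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and standard deviation (divide by n).
population_variance = pvariance(data)
population_sd = pstdev(data)

# Sample variance (divide by n - 1), shown for comparison.
sample_variance = variance(data)

print(value_range, q1, q3, iqr)
print(population_variance, round(population_sd, 2), sample_variance)
```

When the data are a sample used to estimate a population, the n - 1 version is the usual choice, so it is worth stating which formula a reported standard deviation uses.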
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
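As an illustration of these two ideas, the sketch below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are hypothetical, and the interval uses the normal approximation (a multiplier of about 1.96); for a small sample like this one a t-distribution multiplier would normally be used and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    centre = mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided multiplier, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre, standard_error, (centre - z * standard_error,
                                    centre + z * standard_error)


# Hypothetical heart rates (beats per minute) from a sample of ten patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
centre, standard_error, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {centre:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample shrinks the standard error, and with it the width of the interval, because the sample SD is divided by the square root of n.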
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more effective than a
conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
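Writing μ1 for the mean heart rate of all people taking Inderal and μ2 for the mean heart rate of all people taking the new drug, the hypotheses for this example could be stated as below. This is a sketch: the slide does not say which alternative is intended, so the one-sided form (the new drug lowers heart rate more) is an assumption; a two-sided alternative μ1 ≠ μ2 is also common.

```latex
H_0 : \mu_1 = \mu_2
\qquad\text{versus}\qquad
H_1 : \mu_2 < \mu_1
```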
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept               Good decision    Type II error
Reject               Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
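The definition of the p-value can be made concrete with a permutation test, which estimates how often a difference in means at least as large as the observed one would arise if the null hypothesis of no difference were true. This is an illustrative technique chosen for the sketch, not the test used in the lecture; the heart-rate values, the function name and the number of permutations are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        difference = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if difference >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical heart rates (beats per minute) under each drug.
conventional = [78, 82, 75, 80, 79, 84, 77, 81]
new_drug = [70, 74, 69, 73, 72, 76, 68, 71]
p_value = permutation_p_value(conventional, new_drug)
print(f"p-value = {p_value:.4f}")
```

If the p-value falls below the chosen α (commonly 0.05), the null hypothesis is rejected; α is therefore the type I error rate the researcher is willing to accept.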
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg), 0 to 45, against Age (y), 0 to 14; regression equation y = 2.03x + 8]
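The regression line from the plot, y = 2.03x + 8 with x the age in years and y the weight in kilograms, can be used for prediction as in the sketch below. The constant and function names are illustrative; the coefficients are the ones shown on the slide. A fitted line should only be trusted within the range of the data it came from, which the plot's axis suggests is roughly 0 to 14 years.

```python
SLOPE_KG_PER_YEAR = 2.03   # weight gained per additional year of age
INTERCEPT_KG = 8.0         # predicted weight at age 0


def predict_weight(age_years):
    """Predicted weight in kg from the slide's line y = 2.03x + 8."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (2, 5, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.2f} kg")
```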
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is    Disease (outcome) present    Disease (outcome) absent
Positive       TP                           FP
Negative       FN                           TN
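From the four cells of this table (TP, FP, FN, TN) the usual performance measures of a dichotomous diagnostic test follow directly. The sketch below uses hypothetical counts; the predictive values are included for completeness although the slides name only sensitivity and specificity.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures computed from a 2 x 2 diagnostic table."""
    return {
        # Proportion of diseased subjects the test correctly flags.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free subjects the test correctly clears.
        "specificity": tn / (tn + fp),
        # Proportion of positive results that are true positives.
        "positive_predictive_value": tp / (tp + fp),
        # Proportion of negative results that are true negatives.
        "negative_predictive_value": tn / (tn + fn),
    }


# Hypothetical counts: 50 diseased and 100 disease-free subjects.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.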
[Two ROC curves: Sensitivity (y-axis, 0.0 to 1.0) against 1 - Specificity (x-axis, 0.0 to 1.0). Diagonal segments are produced by ties.]
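For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and plotting those pairs produces the ROC curve. The sketch below builds the points and the area under the curve by the trapezoidal rule. The scores and disease labels are hypothetical, and the function names are illustrative.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, from strictest to loosest cut-off."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        # A subject tests positive when the score is at or above the cut-off.
        tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results: higher scores suggest disease (1 = diseased).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
diseased = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

Because tied scores are handled as a single cut-off, a tie between a diseased and a disease-free subject moves the curve in both directions at once, which is what produces a diagonal segment.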
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 68
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker is more
efficient than another conventional βblocker (e.g. Inderal) in decreasing heart
rate or not.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
having Inderal is 1
Let the mean heart rate of all people
having the other new drug is 2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The Ho is:
The
Accept
decision
about
reject
Ho
True
false
Good decision
Type II error
Type I error
Good decision
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
45
y = 2.03x + 8
40
Weight (kg)
35
30
25
20
15
10
5
0
0
2
4
6
8
Age (y)
10
12
14
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 69
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
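A minimal sketch reproducing these figures with Python's standard `statistics` module. Two assumptions are worth flagging: the quartiles use the "inclusive" interpolation method (other conventions can give a different inter-quartile range), and the variance of 11.2 corresponds to dividing by n (the population formula). Dividing by n - 1 (the sample formula) gives 14 for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]  # data set used on the slides

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
pop_sd = statistics.pstdev(data)             # square root of pop_variance
sample_variance = statistics.variance(data)  # divides by n - 1

print(f"range={value_range}, Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"population variance={pop_variance}, SD={pop_sd:.2f}")
print(f"sample variance={sample_variance}")
```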
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
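As an illustration, the standard error of a sample mean is the sample standard deviation divided by the square root of the sample size, and an approximate 95% confidence interval is the mean plus or minus about 1.96 standard errors. The sketch below assumes a normal approximation and uses made-up heart-rate values; for a sample this small a t-distribution multiplier would normally be used instead.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
heart_rates = [72, 75, 78, 80, 74, 69, 77, 73]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)  # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)  # SD of the sample mean

# Normal-approximation multiplier for a two-sided 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean={sample_mean:.2f}, SE={standard_error:.2f}")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```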
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
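One way to make the p-value definition concrete is an exact permutation test, sketched below. This is an illustration only, not necessarily the test used in the lecture: the heart-rate values, group sizes, and the 0.05 threshold are all assumptions. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible regroupings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) in two small groups.
inderal = [72, 75, 78, 80, 74]
new_drug = [68, 70, 65, 71, 69]


def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(a) of the pooled values to group A.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05  # conventional type I error rate (an assumption)
print(f"p-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```

With these made-up values only 2 of the 252 possible regroupings are as extreme as the observed split, so the p-value is about 0.008.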
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Chart: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8]
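A small sketch of how a fitted line such as y = 2.03x + 8 is used for prediction, together with an ordinary least-squares fit. The function names and the age values are illustrative assumptions; the synthetic points are placed exactly on the line, so the fit simply recovers the slope and intercept.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) at a given age (years) from the fitted line."""
    return slope * age_years + intercept


# Synthetic points lying exactly on the line (not real patient data).
ages = [2, 4, 6, 8, 10]
weights = [predict_weight(a) for a in ages]

slope, intercept = fit_line(ages, weights)
print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"predicted weight at age 6: {predict_weight(6):.2f} kg")
```

Predictions are only trustworthy within the range of ages used to fit the line.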
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
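The usual summary measures follow directly from the four cells of this table. The sketch below uses hypothetical counts; the function name and the numbers are illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


# Hypothetical counts: 50 patients with the disease, 100 without.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```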
[Figure: two ROC curves; x-axis: 1 - Specificity, y-axis: Sensitivity, both from 0.0 to 1.0. Diagonal segments are produced by ties.]
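For a continuous diagnostic test, a ROC curve is traced by sliding the positivity threshold and plotting sensitivity against 1 - specificity at each step. The sketch below uses made-up scores and disease labels, and estimates the area under the curve with the trapezoid rule; tied scores would share one threshold and so produce a diagonal segment.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # A patient is called test-positive when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to perfect discrimination.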
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 70
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: Collecting data, Describing data, Analyzing data, Good Decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables and the Data
              Age    Gender    Blood type
Mrs Brown     32     Female    O
Mr Patel      24     Male      O
Ms Manda      20     Female    A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
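The four summaries listed above can be sketched with Python's standard library. The patient records below are invented for illustration and are not the lecture's data. Cumulative frequency is most meaningful when the categories have a natural order; it is shown here only to demonstrate the calculation.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical records: (group, complication) for ten patients.
records = [
    ("Group I", "Nausea"), ("Group I", "Pain"), ("Group I", "Pain"),
    ("Group I", "Vomiting"), ("Group I", "Cough"),
    ("Group II", "Nausea"), ("Group II", "Nausea"), ("Group II", "Pain"),
    ("Group II", "Pain"), ("Group II", "Vomiting"),
]
categories = ["Nausea", "Vomiting", "Pain", "Cough"]

# Frequency: how many patients fall in each category.
frequency = Counter(complication for _, complication in records)
total = sum(frequency.values())

# Relative frequency: each count as a share of the total.
relative = {c: frequency[c] / total for c in categories}

# Cumulative frequency: running total in the listed category order.
cumulative = dict(zip(categories, accumulate(frequency[c] for c in categories)))

# Cross tabulation: counts for each (group, complication) combination.
cross_tab = Counter(records)

for c in categories:
    print(f"{c:<9} n={frequency[c]}  rel={relative[c]:.0%}  cum={cumulative[c]}")
for group in ("Group I", "Group II"):
    print(group, [cross_tab[(group, c)] for c in categories])
```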
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; slices Nausea, Vomiting, Pain and Cough; slice values shown: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Chart: simple bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough); y-axis: Number of patients, 0 to 25]
Annotations: alternative, width, spacing
[Chart: simple bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough); y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Chart: clustered bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: Number of patients (%), 0% to 45%]
[Chart: 100% stacked bar chart of Postoperative Complications for Group I and Group II, with segments Cough, Pain, Vomiting and Nausea; y-axis: Number of patients (%), 0% to 100%]
Slide 72
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart of postoperative complications; legend: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar charts of postoperative complications (Nausea, Vomiting, Pain, Cough); y-axis: number of patients, shown once as counts (0 to 25) and once as percentages (0% to 45%); alternative versions vary bar width and spacing]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of postoperative complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: the clustered bar chart of postoperative complications (Group I and Group II) shown next to a stacked bar chart in which each group's bar is divided into Cough, Pain, Vomiting and Nausea; y-axis: number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: postoperative complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
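The four measures of location can be checked with Python's standard library. This is only a sketch on the slide's example values; the midrange is computed here as (minimum + maximum) / 2, and the second data set shows the even-sized case for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

print(mean, median, mode, midrange)

# With an even number of values the median is the mean of the two middle ones.
print(statistics.median([1, 3, 5, 9]))
```

The mean and the midrange are both pulled toward the extreme value 10, while the median and the mode are not.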
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
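The measures of spread can be reproduced in the same way. In this sketch the variance and standard deviation use the population formulas (dividing by n), which is what the slide values correspond to; the sample formula (dividing by n - 1) is shown for comparison. Quartile values depend on the convention used, and the "inclusive" method below is only one common choice, not necessarily the one used in the lecture.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

value_range = max(data) - min(data)

# Quartiles depend on the convention; method="inclusive" is one common choice.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
pop_sd = statistics.pstdev(data)             # square root of pop_variance

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the data are a sample from a larger population, the n - 1 version is the usual choice, and it gives a larger variance for these values.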
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
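A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate sample is hypothetical, and the interval uses the normal approximation (multiplier of about 1.96) as a simplifying assumption.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample SD (divides by n - 1)
se = sd / math.sqrt(n)          # standard error of the mean

# Normal-approximation multiplier (about 1.96 for 95%). For a sample this
# small, a t-distribution multiplier would give a somewhat wider interval.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks with the square root of the sample size, so quadrupling n roughly halves the width of the interval.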
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
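To make the p-value definition concrete, here is a sketch of an exact permutation test for two independent samples. This technique is chosen for illustration because it needs no distribution tables; it is not presented in the lecture, and the heart-rate values are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) under two drugs; not lecture data.
drug_a = [78, 74, 81, 77, 80]
drug_b = [70, 73, 69, 75, 72]

observed = abs(mean(drug_a) - mean(drug_b))
pooled = drug_a + drug_b
indices = range(len(pooled))

# Under H0 the group labels are interchangeable, so every way of splitting
# the pooled values into two groups of these sizes is equally likely.
extreme = total = 0
for chosen in combinations(indices, len(drug_a)):
    group_1 = [pooled[i] for i in chosen]
    group_2 = [pooled[i] for i in indices if i not in chosen]
    total += 1
    # Small tolerance guards against floating-point rounding in the comparison.
    if abs(mean(group_1) - mean(group_2)) >= observed - 1e-12:
        extreme += 1

p_value = extreme / total  # share of splits at least as extreme as observed
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

The p-value is the proportion of relabelings that give a difference at least as large as the one observed, which mirrors the definition above: the probability of the observed outcome (or one more extreme) assuming H0 is true.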
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg, 0 to 45) against age (y, 0 to 14) with the fitted regression line y = 2.03x + 8]
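Prediction with the fitted line from the slide, y = 2.03x + 8 (weight in kg, age in years), can be sketched as a one-line function. The function name is hypothetical, and the line should only be trusted within the age range covered by the plot.

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line on the slide: y = 2.03x + 8."""
    slope, intercept = 2.03, 8.0
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope means that each additional year of age adds about 2.03 kg to the predicted weight, and the intercept (8 kg) is the predicted weight at age 0.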
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease (outcome) present    Disease (outcome) absent
Test positive      TP                           FP
Test negative      FN                           TN
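The usual accuracy measures follow directly from the four cells TP, FP, FN and TN. The sketch below uses hypothetical counts, and the function name is made up for this example.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects correctly detected
        "specificity": tn / (tn + fp),  # disease-free subjects correctly ruled out
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts; not data from the lecture.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.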
[Figure: two ROC curves; y-axis: sensitivity (0.0 to 1.0); x-axis: 1 - specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
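For a continuous diagnostic test, an ROC curve plots sensitivity against 1 - specificity at every possible cut-off. The following sketch computes those points and the area under the curve by the trapezoidal rule; the scores and disease labels are hypothetical, and roc_points is a made-up helper name.

```python
# Hypothetical test scores and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
status = [1, 1, 0, 1, 0, 1, 0, 0]

def roc_points(scores, status):
    """Return (1 - specificity, sensitivity) for each cut-off, highest first."""
    positives = sum(status)
    negatives = len(status) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, status) if s >= cutoff and d == 1)
        fp = sum(1 for s, d in zip(scores, status) if s >= cutoff and d == 0)
        points.append((fp / negatives, tp / positives))
    return points

points = roc_points(scores, status)

# Area under the curve by the trapezoidal rule.
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(points[-1], round(auc, 3))
```

An area of 0.5 corresponds to a test that is no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.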
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 75
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
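The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the severity and group values are hypothetical data invented for the example, not figures from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data (not from the slides): degree of illness for 10 patients
# and the study group each patient belongs to.
LEVELS = ["Mild", "Moderate", "Severe"]
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
group = ["I", "I", "I", "I", "I", "II", "II", "II", "II", "II"]

counts = Counter(severity)
n = len(severity)

frequency = [counts[level] for level in LEVELS]   # how many patients per category
relative = [f / n for f in frequency]             # share of the whole sample
cumulative = list(accumulate(frequency))          # running total (meaningful for ordered categories)

for level, f, r, c in zip(LEVELS, frequency, relative, cumulative):
    print(f"{level:<10}{f:>3}{r:>8.0%}{c:>4}")

# Cross tabulation: count each (group, severity) combination.
crosstab = Counter(zip(group, severity))
for g in ("I", "II"):
    print("Group", g, [crosstab[(g, level)] for level in LEVELS])
```

Cumulative frequency is only meaningful when the categories have an order, which is why an ordinal variable is used here.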
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (Area-relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients, 0 to 25]
Alternative
Width
spacing
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
[Stacked bar chart: Postoperative Complications. x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
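As a quick check, the four measures of location can be computed for the slide's data set with Python's standard `statistics` module. The midrange is taken here as (minimum + maximum) / 2, and the second data set is the slide's even-length example for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)              # sum of the values divided by their count
median = statistics.median(data)          # middle value of the sorted data (odd count)
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the smallest and largest value

# Even count: the median is the average of the two middle values (3 and 5).
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.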
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
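Quartiles and percentiles for the 11-value data set can be sketched as follows. Several conventions for interpolating percentiles exist and they give different answers on small samples; the two shown are simply the ones offered by Python's `statistics.quantiles`, not necessarily the convention used in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles are the three cut points that split the sorted data into four parts.
q_exclusive = statistics.quantiles(data, n=4)                      # default method
q_inclusive = statistics.quantiles(data, n=4, method="inclusive")  # treats min and max as the 0th and 100th percentiles

# Deciles are the nine cut points for n=10; the 70th and 90th percentiles
# are the 7th and 9th of them.
deciles = statistics.quantiles(data, n=10)
p70, p90 = deciles[6], deciles[8]

print("Quartiles (exclusive):", q_exclusive)   # [5.0, 9.0, 17.0]
print("Quartiles (inclusive):", q_inclusive)   # [5.5, 9.0, 15.0]
print("70th and 90th percentiles:", p70, p90)
```

Both methods agree on the median (9) but differ on the first and third quartiles, so it is worth stating which convention a report uses.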
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (population formula, dividing by n)
Standard deviation = 3.35
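The measures of spread for the same data set can be reproduced as below. The variance of 11.2 corresponds to the population formula (dividing by n); the sample formula (dividing by n - 1) is printed alongside for comparison. The quartile method is an assumption chosen because it gives the inter-quartile range of 4 quoted on the slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles with the "inclusive" convention (an assumption; other conventions
# give a different inter-quartile range on such a small sample).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)     # sum of squared deviations / n
pop_sd = statistics.pstdev(data)              # square root of the population variance
sample_variance = statistics.variance(data)   # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)

print("Range:", data_range)
print("IQR:", iqr)
print("Population variance and SD:", pop_variance, round(pop_sd, 2))
print("Sample variance and SD:", sample_variance, round(sample_sd, 2))
```

When the five values are a sample from a larger population, the n - 1 version is the usual choice for estimating the population variance.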
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
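A minimal sketch of the standard error of a mean and its 95% confidence interval follows. The heart-rate values are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); with a sample this small a t-distribution multiplier would normally be preferred and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)        # sample standard deviation
se = sd / math.sqrt(n)                    # standard error of the mean

z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard deviation describes the spread of individual patients, while the standard error describes how precisely the sample mean estimates the population mean.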
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 (rows) against what H0 really is (columns):
              H0 is true       H0 is false
Accept H0     Good decision    Type II error
Reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
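The definition of the p-value can be made concrete with an exact permutation test, sketched below. This is one illustrative way to obtain a p-value for two independent samples, not the method used in the lecture, and the heart-rate values are hypothetical. Under the null hypothesis the group labels are interchangeable, so every way of splitting the pooled values into two groups is equally likely.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = a + b
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    splits = extreme = 0
    # Enumerate every way of choosing which pooled values form the first group.
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / len(a) - (total - sum_a) / len(b))
        splits += 1
        if diff >= observed - 1e-12:   # as extreme as, or more extreme than, the observed difference
            extreme += 1
    return extreme / splits

# Hypothetical heart rates (beats/min) under each drug.
inderal = [70, 74, 68, 72, 75]
new_drug = [64, 66, 69, 63, 67]
print("p-value:", permutation_p_value(inderal, new_drug))
```

A small p-value means that a difference at least as large as the one observed would rarely arise if the null hypothesis were true; it is compared with the chosen α to decide whether to reject H0.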
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8. x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
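Prediction with the fitted line y = 2.03x + 8 (weight in kg from age in years) can be sketched as below. The function names are hypothetical, and the least-squares helper is shown only to illustrate where such a line comes from; it is run on points generated from the equation itself, since the original data are not available.

```python
def predict_weight(age_years):
    """Predicted weight (kg) from the line shown on the plot: y = 2.03x + 8."""
    return 2.03 * age_years + 8

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

print("Predicted weight at 5 years:", predict_weight(5), "kg")

# Illustration only: points lying exactly on the line recover its coefficients.
ages = [2, 4, 6, 8, 10]
weights = [predict_weight(a) for a in ages]
slope, intercept = fit_line(ages, weights)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
```

Predictions are only trustworthy within the age range covered by the plot (0 to 14 years); extrapolating the line beyond it is not supported by the data.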
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test result (rows) against the disease (outcome) (columns):
                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN
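The usual accuracy measures follow directly from the four cells of the table. The sketch below uses hypothetical counts, and the function name is an assumption made for the example.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly testing positive
        "specificity": tn / (tn + fp),   # healthy patients correctly testing negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }

# Hypothetical counts: 100 diseased and 100 non-diseased patients.
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.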
[Two ROC curves. x-axis: 1 - Specificity, 0.0 to 1.0; y-axis: Sensitivity, 0.0 to 1.0. Diagonal segments are produced by ties.]
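For a continuous diagnostic test, each possible cut-off gives one (1 - specificity, sensitivity) point, and joining the points gives the ROC curve. The sketch below is a minimal illustration with hypothetical scores; the function names are assumptions, and the area under the curve is computed with the trapezoid rule.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) for every cut-off, highest first.

    labels: 1 = disease present, 0 = disease absent.
    A result counts as test-positive when score >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test values and true disease status.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print("AUC:", area_under_curve(curve))
```

When a diseased and a non-diseased patient share the same score, one cut-off moves both coordinates at once, which is what produces a diagonal segment.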
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 2
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? They can be grouped into classes first.)
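The four summaries listed on this slide can be sketched in a few lines of Python. This is only an illustration: the patient records, the group labels and the helper names `frequency_table` and `cross_tabulation` are hypothetical and do not come from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: one postoperative complication per patient.
complications = ["Pain", "Nausea", "Pain", "Cough", "Vomiting",
                 "Nausea", "Pain", "Pain", "Vomiting", "Nausea"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    categories = sorted(counts)              # alphabetical order, an arbitrary choice
    freqs = [counts[c] for c in categories]
    cumulative = list(accumulate(freqs))     # running total of the frequencies
    return [(c, f, f / total, cum)
            for c, f, cum in zip(categories, freqs, cumulative)]

def cross_tabulation(rows, columns):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, columns))

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.1%}{cum:>4}")

# Hypothetical group labels for the same ten patients.
groups = ["I", "I", "II", "I", "II", "II", "I", "II", "I", "II"]
crosstab = cross_tabulation(groups, complications)
print(crosstab[("I", "Pain")])   # patients in Group I who reported pain
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for purely nominal categories such as these it depends on the arbitrary ordering chosen.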
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
[Pie chart: Nausea, Vomiting, Pain, Cough; slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
[Bar chart: Nausea, Vomiting, Pain, Cough on the x-axis; Number of patients, 0 to 25, on the y-axis]
Alternative: width, spacing
Postoperative Complications
[Bar chart: Nausea, Vomiting, Pain, Cough on the x-axis; Number of patients (%), 0% to 45%, on the y-axis]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
[Clustered bar chart: Group I and Group II side by side for Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
[Clustered bar chart: Group I and Group II side by side for Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Postoperative Complications
[Stacked bar chart: one bar each for Group I and Group II, divided into Cough, Pain, Vomiting, Nausea; y-axis: Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between the bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Postoperative Complications
[Line chart (frequency polygon): Group I and Group II across Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
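1, 3, 5, 9
Odd vs even number of values: with an odd count the median is the middle value (3 above); with an even count it is the mean of the two middle values ((3 + 5) / 2 = 4 here).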
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = (1 + 10) / 2 = 5.5
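As a quick check of these location measures, the same five values can be run through Python's standard `statistics` module. This is a minimal sketch; the midrange is computed here as (minimum + maximum) / 2, and the second data set reuses the even-count example 1, 3, 5, 9 from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]                  # the example values from the slides

mean = statistics.mean(data)             # sum of the values / number of values
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between the two extremes

print(mean, median, mode, midrange)

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])
print(even_median)
```

The single large value (10) pulls the mean and the midrange upwards, while the median and mode are unaffected, which is why the median is usually preferred for skewed data.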
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; 14.0 if dividing by n - 1)
Standard deviation = 3.35 (dividing by n; 3.74 if dividing by n - 1)
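The spread measures above can be reproduced with the standard library. A sketch under two stated assumptions: quartiles are taken with the "inclusive" method of `statistics.quantiles` (other conventions give different quartiles for so few values), and the variance is shown both as a population value (divide by n) and as a sample value (divide by n - 1).

```python
import statistics

data = [1, 1, 3, 5, 10]                       # the example values from the slides

value_range = max(data) - min(data)           # largest value minus smallest value

# Quartiles with the "inclusive" method; Q2 is the median.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                 # inter-quartile range

pop_variance = statistics.pvariance(data)     # sum of squared deviations / n
pop_sd = statistics.pstdev(data)              # square root of the above
sample_variance = statistics.variance(data)   # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)

print(value_range, iqr)
print(pop_variance, round(pop_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

When the five values are a sample drawn from a larger population, the n - 1 version is the usual estimate of the population variance.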
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
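A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values and the function name `mean_confidence_interval` are hypothetical. The interval uses the normal quantile (about 1.96); for a sample this small a t-distribution quantile would normally be used instead, which the standard library does not provide.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)               # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical heart rates (beats/min) from a sample of ten patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

A larger sample gives a smaller standard error and therefore a narrower interval around the sample mean.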
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                 H0 is true       H0 is false
Accept H0        Good decision    Type II error
Reject H0        Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
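The p-value definition above can be made concrete with an exact permutation test: if H0 is true, the group labels are interchangeable, so every way of re-dividing the pooled values into two groups is equally likely. This is one illustrative technique, not necessarily the test used in the lecture. The heart-rate values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_difference(sum_a):
        # Difference in means when the first group has the given sum.
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_difference(sum(group_a)))
    extreme = 0
    arrangements = 0
    # Every way of choosing which pooled values play the role of group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        arrangements += 1
        # Count outcomes at least as extreme as the one observed.
        if abs(mean_difference(sum_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / arrangements

inderal = [70, 72, 68, 74, 71]     # hypothetical heart rates on Inderal
new_drug = [64, 66, 63, 69, 65]    # hypothetical heart rates on the new drug
p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

If the p-value is smaller than the chosen α (commonly 0.05), H0 is rejected; with only five patients per group the test has little power, so a real study would need a sample size calculation first.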
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN
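From the four cells of this table the usual accuracy measures of a dichotomous test follow directly. A small sketch; the counts and the function name `diagnostic_metrics` are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures of a dichotomous test from the 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }

# Hypothetical counts: 50 patients with the disease, 100 without.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.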
ROC Curve
[Two ROC curve panels; x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0)]
Diagonal segments are produced by ties.
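For a continuous test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and joining those pairs produces the ROC curve. The sketch below builds the points and estimates the area under the curve with the trapezoidal rule. The scores, disease labels and function names are hypothetical.

```python
def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) pairs, one per distinct cut-off, highest first."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]                     # cut-off above every score
    for cutoff in sorted(set(scores), reverse=True):
        # A patient tests positive when the score is at or above the cut-off.
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test values and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(auc)
```

Because tied scores share one cut-off, they move the curve up and to the right in a single step, which is where the diagonal segments come from. An area of 0.5 corresponds to a useless test and 1.0 to a perfect one.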
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 3
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
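A minimal sketch of how the quartiles and percentiles of the second series could be computed. Several interpolation conventions exist and statistical packages may return slightly different values; this version assumes linear interpolation between the two nearest ranked values, and the function name percentile is hypothetical.

```python
def percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) by linear interpolation."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1) / 100   # fractional index into the sorted data
    lower = int(position)                     # index of the value at or below the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower]                 # the position falls exactly on a data value
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second example series from the slide

q1 = percentile(data, 25)    # 1st quartile
q3 = percentile(data, 75)    # 3rd quartile
p70 = percentile(data, 70)   # 70th percentile
p90 = percentile(data, 90)   # 90th percentile
print(q1, q3, p70, p90)
```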
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
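The spread measures above can be reproduced with the standard-library statistics module. This is a sketch under two assumptions: the variance divides by n (population form), and the quartiles use the "inclusive" interpolation method.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]  # example values from the slide

data_range = max(data) - min(data)                     # largest minus smallest value
q1, _, q3 = quantiles(data, n=4, method="inclusive")   # 1st, 2nd and 3rd quartiles
iqr = q3 - q1                                          # inter-quartile range
variance = pvariance(data)                             # mean squared deviation (divides by n)
std_dev = pstdev(data)                                 # square root of that variance

print(data_range, iqr, variance, round(std_dev, 2))
```

Dividing by n - 1 instead (statistics.variance and statistics.stdev) gives the sample variance 14 and a standard deviation of about 3.74, so it is worth stating which form is reported.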
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
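As an illustration of these two quantities, the sketch below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are hypothetical, not data from the lecture, and the interval assumes a normal approximation (z of about 1.96); with samples this small a t-distribution multiplier would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 75, 70, 78, 74, 69, 71, 77, 73, 76]

n = len(heart_rates)
sample_mean = mean(heart_rates)
standard_error = stdev(heart_rates) / sqrt(n)   # SD of the sample mean
z = NormalDist().inv_cdf(0.975)                 # two-sided 95% multiplier, about 1.96
ci_lower = sample_mean - z * standard_error
ci_upper = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_lower:.1f}, {ci_upper:.1f})")
```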
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) at decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0     H0 is true        H0 is false
Accept H0             Good decision     Type II error
Reject H0             Type I error      Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
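The p-value definition above can be made concrete with an exact permutation test, sketched here for the β-blocker example. This is one possible test, chosen because it needs no distributional tables; the heart-rate values and the function name permutation_p_value are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    as_extreme = 0
    n_splits = 0
    # Under H0 the group labels are arbitrary, so every split of the pooled
    # values into groups of the same sizes is equally likely.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:   # as extreme as, or more extreme than, observed
            as_extreme += 1
        n_splits += 1
    return as_extreme / n_splits

# Hypothetical heart rates (beats/min) after each drug.
inderal = [72, 75, 70, 78, 74]
new_drug = [65, 68, 66, 70, 63]

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

If the p-value falls below the chosen α (conventionally 0.05), the null hypothesis of no difference in mean heart rate is rejected.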
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: plot of weight (kg), 0 to 45, against age (y), 0 to 14, with the fitted line y = 2.03x + 8]
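A fitted line such as y = 2.03x + 8 is typically obtained by least squares and then used for prediction. The sketch below shows the idea; the ages and weights are hypothetical values generated to lie exactly on that line, and fit_line is an illustrative helper, not the lecture's method.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical ages (years) and weights (kg) lying exactly on y = 2.03x + 8.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted_weight = slope * 7 + intercept   # predicted weight (kg) at age 7
print(round(slope, 2), round(intercept, 2), round(predicted_weight, 2))
```

Predictions are only reasonable within the range of ages used to fit the line.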
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is:          Disease present        Disease absent
Positive              TP (true positive)     FP (false positive)
Negative              FN (false negative)    TN (true negative)
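From the four cells of this table the usual summary measures of a dichotomous diagnostic test follow directly. A minimal sketch, with hypothetical counts and an illustrative function name:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly testing positive
        "specificity": tn / (tn + fp),   # disease-free patients correctly testing negative
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts: 100 patients with the disease, 100 without.
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.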
[Figure: two ROC curve panels. y-axis: sensitivity, 0.0 to 1.0; x-axis: 1 - specificity, 0.0 to 1.0. Plot note: diagonal segments are produced by ties.]
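For a continuous test, an ROC curve is built by moving the positivity threshold across the observed values and recording sensitivity against 1 - specificity at each step. The sketch below assumes that higher scores indicate disease and estimates the area under the curve with the trapezoidal rule; the scores, labels, and function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]   # threshold above every score: nobody tests positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical test values; label 1 = disease present, 0 = disease absent.
separated = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
tied = roc_points([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])

print(area_under_curve(separated))   # test that separates the groups completely
print(area_under_curve(tied))        # all values tied: one diagonal segment
```

When diseased and disease-free patients share the same test value, both rates change in a single step, which is what draws a diagonal segment on the plot.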
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 6
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Figure: pie chart of Postoperative Complications (categories: Nausea, Vomiting, Pain, Cough; slices: 23.5%, 27.5%, 9.8%, 39.2%)
Advantage:
1. Summarizes the data (slice area represents relative frequency)
2. Shows magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Figure: simple bar chart of Postoperative Complications (y-axis: Number of patients; categories: Nausea, Vomiting, Pain, Cough)
Alternative: width, spacing
Figure: simple bar chart of Postoperative Complications (y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Figure: clustered bar chart of Postoperative Complications, Group I vs Group II (y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Figure: clustered bar chart of Postoperative Complications, Group I vs Group II (y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough)
Figure: stacked bar chart of Postoperative Complications (y-axis: Number of patients (%), 0% to 100%; bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Figure: frequency polygon of Postoperative Complications, Group I vs Group II (y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough)
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
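As a quick check, the four measures of location can be computed for the example data 1, 1, 3, 5, 10 with Python's standard `statistics` module. The midrange is taken here as (minimum + maximum) / 2, which gives (1 + 10) / 2 = 5.5. The second data set, 1, 3, 5, 9, is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)              # sum of values divided by their number
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the extremes

# Even number of values: the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean (4) is pulled above the median (3) by the single large value 10, which is one reason the median is often preferred for skewed data.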
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
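The figures above can be reproduced with the standard `statistics` module, under two assumptions that the slide does not state explicitly: the quartiles use the "inclusive" interpolation method (other conventions can give a different inter-quartile range for such a small data set), and the variance divides by n (the population formula). Dividing by n - 1 instead (the sample formula) gives 14 rather than 11.2.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # sum of squared deviations / n
pop_sd = statistics.pstdev(data)             # square root of the population variance
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```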
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
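A minimal sketch of both quantities for a sample mean follows. It assumes the standard error is the sample standard deviation divided by the square root of n, and it uses the normal approximation (about 1.96 standard errors either side of the mean) for the 95% interval. For small samples a t-based multiplier would usually be preferred. The heart-rate values and the function name are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)                 # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical heart rates (beats/min) from a sample of 12 patients.
heart_rates = [72, 75, 68, 80, 77, 74, 69, 73, 76, 71, 78, 70]

mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

A larger sample shrinks the standard error and so narrows the interval, which is the link between sample size and the precision of the estimate.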
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
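The p-value definition can be made concrete with a permutation test, sketched below. This technique is not named in the lecture; it is used here only because it follows the definition directly: if the null hypothesis is true, the group labels are interchangeable, so the p-value is estimated as the share of random relabelings whose difference in means is at least as extreme as the one observed. The heart-rate values, the function name and its parameters are all hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided p-value for a difference in means, estimated by random relabeling."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel patients at random, as if H0 were true
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Count the observed labeling itself so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical heart rates (beats/min) under each drug.
inderal = [68, 72, 70, 75, 71, 69, 74, 73]
new_drug = [62, 66, 64, 67, 63, 65, 61, 68]

p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

If p falls below the chosen α (commonly 0.05), the null hypothesis of no difference is rejected. With identical groups every relabeling is "at least as extreme", so the function returns 1.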
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8
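Prediction with a fitted line such as y = 2.03x + 8 can be sketched as follows. The first function is an ordinary least-squares fit for one predictor; the second simply applies the slope and intercept shown on the chart. The data behind the chart are not given, so the function names and the small data set in the usage lines are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line on the chart."""
    return slope * age_years + intercept


# Hypothetical points lying exactly on y = 2x + 8.
slope, intercept = fit_line([0, 1, 2, 3], [8, 10, 12, 14])
print(slope, intercept, predict_weight(10))
```

Predictions are only reasonable within the range of ages used to fit the line (0 to 14 years on the chart axis).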
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is:    Disease (outcome) present    Disease (outcome) absent
Positive        TP                           FP
Negative        FN                           TN
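From the four cells of this table, sensitivity and specificity (the two quantities plotted on a ROC curve) follow directly. The sketch below uses hypothetical counts, and the function name is an assumption.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and false positive rate from a 2x2 table."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly testing positive
    specificity = tn / (tn + fp)   # healthy patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,  # x-axis of a ROC curve
    }


# Hypothetical counts: 50 diseased and 100 healthy patients.
measures = diagnostic_measures(tp=45, fp=10, fn=5, tn=90)
print(measures)
```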
Figure: two ROC curves (y-axis: Sensitivity; x-axis: 1 - Specificity; both axes from 0.0 to 1.0). Diagonal segments are produced by ties.
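For a continuous diagnostic test, a ROC curve can be traced by using each observed value as a cut-off and recording the resulting sensitivity and 1 - specificity. The sketch below assumes higher scores indicate disease and estimates the area under the curve with the trapezoid rule. The function names and example scores are hypothetical.

```python
def roc_points(scores, labels):
    """ROC points (1 - specificity, sensitivity); labels are 1 = diseased, 0 = healthy."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; "score >= cut-off" counts as test positive.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test that separates the two groups perfectly.
perfect = roc_points([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
# A tie between a diseased and a healthy patient gives a diagonal segment.
tied = roc_points([0.5, 0.5], [1, 0])
print(area_under_curve(perfect), area_under_curve(tied))
```

An area of 1 means perfect discrimination, while 0.5 is no better than chance.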
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 9
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: collecting data, describing data and analyzing data lead to a good decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status (ASA class)
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping.)
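The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the ten severity values are hypothetical, and the ordinal "degree of illness" variable is used because cumulative frequency is only meaningful when the categories have an order.

```python
from collections import Counter

# Hypothetical ordinal data: degree of illness for 10 patients.
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
order = ["Mild", "Moderate", "Severe"]  # natural order of the categories

counts = Counter(severity)
n = len(severity)

table = []       # rows of (category, frequency, relative, cumulative)
cumulative = 0
for category in order:
    frequency = counts[category]
    cumulative += frequency
    table.append((category, frequency, frequency / n, cumulative))

for category, frequency, relative, cum in table:
    print(f"{category:<9}{frequency:>3}{relative:>7.0%}{cum:>4}")
```

The last cumulative frequency always equals the number of patients, which is a quick check that no category was missed.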
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart of postoperative complications (nausea, vomiting, pain, cough) with slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes one variable (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: two simple bar charts of postoperative complications (nausea, vomiting, pain, cough), one with number of patients and one with percentage of patients on the vertical axis]
Alternative – Width – Spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough), percentage of patients in Group I and Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough) in Group I and Group II, shown next to a 100% stacked bar chart with one bar per group]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon of postoperative complications (nausea, vomiting, pain, cough), percentage of patients in Group I and Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
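As a check on the four measures of location, the slide's example values can be run through Python's standard `statistics` module. The midrange has no library function, so it is computed here as (minimum + maximum) / 2; the second median call uses the slide's even-sized example 1-3-5-9.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the slide's example values

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# Even number of values: the median is the mean of the two middle values.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The mean (4) sits above the median (3) because the single large value 10 pulls it upward, which is the usual argument for reporting the median when data are skewed.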
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
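A minimal sketch of how these percentiles could be computed for the 11-value example, assuming the common textbook rule that the p-th percentile sits at position p × (n + 1) in the sorted data. Python's `statistics.quantiles` with `method="exclusive"` follows that rule; other software may use a different rule and give slightly different answers.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the slide's 11 values

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")

# Deciles: nine cut points; the 7th and 9th are the 70th and 90th percentiles.
deciles = quantiles(data, n=10, method="exclusive")
p70, p90 = deciles[6], deciles[8]

print("Quartiles:", q1, q2, q3)
print("70th and 90th percentiles:", p70, p90)
```

Under this rule the quartiles are 5, 9 and 17, and the second quartile is the median.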
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
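The spread measures can be reproduced with the standard library. Two assumptions are worth flagging: the quoted variance of 11.2 matches the population formula (dividing by n), and the inter-quartile range of 4 matches the "inclusive" quartile rule; the n − 1 sample variance and other quartile rules give different numbers.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]  # the slide's example values

value_range = max(data) - min(data)

# Quartile rule is an assumption; "inclusive" reproduces the slide's IQR.
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = pvariance(data)     # sum of squared deviations / n
pop_sd = pstdev(data)              # square root of the population variance
sample_variance = variance(data)   # sum of squared deviations / (n - 1)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```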
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
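A short sketch of the standard error of the mean and a 95% confidence interval. The heart-rate values are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
heart_rates = [68, 72, 75, 70, 66, 74, 71, 69, 73, 72]

n = len(heart_rates)
sample_mean = mean(heart_rates)
standard_error = stdev(heart_rates) / sqrt(n)  # SD of the sample mean

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
ci_lower = sample_mean - z * standard_error
ci_upper = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}")
print(f"95% CI = ({ci_lower:.1f}, {ci_upper:.1f})")
```

The standard error shrinks as the sample grows, so a larger sample gives a narrower interval around the same mean.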
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
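The p-value definition above can be made concrete with an exact permutation test, sketched here under stated assumptions: the two sets of heart rates are hypothetical, and the test statistic is the difference in group means. If the null hypothesis is true, the group labels are interchangeable, so every way of splitting the ten values into two groups of five is equally likely.

```python
from itertools import combinations
from statistics import mean

inderal = [72, 75, 70, 78, 74]    # hypothetical heart rates (beats/min)
new_drug = [66, 69, 71, 64, 68]   # hypothetical heart rates (beats/min)

observed = mean(inderal) - mean(new_drug)
pooled = inderal + new_drug
total = sum(pooled)
n1 = len(inderal)
n2 = len(new_drug)

splits = 0    # number of possible relabelings
extreme = 0   # relabelings at least as extreme as the observed one
for indices in combinations(range(len(pooled)), n1):
    group1_sum = sum(pooled[i] for i in indices)
    diff = group1_sum / n1 - (total - group1_sum) / n2
    splits += 1
    if abs(diff) >= abs(observed) - 1e-12:  # two-sided comparison
        extreme += 1

p_value = extreme / splits
print(f"observed difference = {observed:.1f}, p = {p_value:.3f}")
```

A small p-value means a difference this large would rarely arise from relabeling alone, which is the evidence used to reject H0.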
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg) against age (y) with fitted line y = 2.03x + 8]
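A sketch of how a prediction line like the one in the figure could be obtained and used. The age and weight pairs below are hypothetical, so the fitted slope and intercept only roughly resemble the slide's y = 2.03x + 8; the `predict_weight` helper is an illustrative name that defaults to the slide's coefficients.

```python
from statistics import mean

ages = [2, 4, 6, 8, 10, 12]          # hypothetical ages (years)
weights = [12, 16, 21, 24, 28, 33]   # hypothetical weights (kg)

# Ordinary least squares for a single predictor.
x_bar, y_bar = mean(ages), mean(weights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))
sxx = sum((x - x_bar) ** 2 for x in ages)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

def predict_weight(age, slope=2.03, intercept=8.0):
    """Predicted weight (kg) at a given age (years) from a fitted line."""
    return slope * age + intercept

print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"slide's line at age 5: {predict_weight(5):.2f} kg")
```

Predictions are only reasonable inside the range of ages used to fit the line.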
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN
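The usual summary measures of a diagnostic test follow directly from the four cells of this table. The counts below are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical 2x2 counts; not taken from the lecture.
tp, fp, fn, tn = 45, 10, 5, 90

sensitivity = tp / (tp + fn)  # diseased patients who test positive
specificity = tn / (tn + fp)  # disease-free patients who test negative
ppv = tp / (tp + fp)          # positive tests that are truly diseased
npv = tn / (tn + fn)          # negative tests that are truly disease-free

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group being tested.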
[Figure: two ROC curve panels plotting sensitivity against 1 - specificity; diagonal segments are produced by ties]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 12
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
[Diagram: Collecting data, Describing data, Analyzing data, Good Decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Infer (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables and the Data
Name         Age    Gender    Blood type
Mrs Brown    32     Female    O
Mr Patel     24     Male      O
Ms Manda     20     Female    A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous – binary):
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
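The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the ASA classes and the two groups below are made-up values for ten hypothetical patients, not data from the lecture. An ordinal variable is used so that the cumulative frequency has a meaningful order.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ASA physical status for 10 patients (illustrative values only).
asa = ["I", "II", "II", "I", "III", "II", "I", "II", "III", "II"]
# Hypothetical study group for the same 10 patients.
group = ["A", "A", "B", "B", "A", "B", "A", "A", "B", "B"]

order = ["I", "II", "III"]          # ordinal categories in their natural order
counts = Counter(asa)
n = len(asa)

frequency = [counts[c] for c in order]        # how many patients per category
relative = [f / n for f in frequency]         # share of all patients
cumulative = list(accumulate(frequency))      # running total along the order

for c, f, r, cf in zip(order, frequency, relative, cumulative):
    print(f"ASA {c}: frequency={f}  relative={r:.0%}  cumulative={cf}")

# Cross tabulation: count every (group, ASA class) combination.
crosstab = Counter(zip(group, asa))
for g in ("A", "B"):
    print(g, [crosstab[(g, c)] for c in order])
```

For a numerical variable the same table can be built after grouping the values into intervals first.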
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantage:
1. Summarize (area - relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); categories: Nausea, Vomiting, Pain, Cough]
Alternative
Width
spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
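As a quick check, the four measures of location can be recomputed for the slide's data set with Python's standard statistics module. The midrange is taken here as (smallest value + largest value) / 2, which gives 5.5 for these data. The second data set shows the "odd vs even" case for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]                 # data set used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# Even number of values: the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(f"mean={mean}  median={median}  mode={mode}  midrange={midrange}")
print(f"median of 1-3-5-9 = {median_even}")
```

Note how the single large value (10) pulls the mean and the midrange upwards while the median stays at 3.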
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
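The values above can be reproduced with the standard statistics module, as a sketch. Two choices are assumptions made here to match the slide: the quartiles use the "inclusive" interpolation method (other quartile conventions give a different inter-quartile range for such a small data set), and the variance divides by n (population formula). The sample formula, which divides by n - 1, is shown alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]                      # data set used on the slides

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas: sum of squared deviations divided by n.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(f"range={value_range}  Q1={q1}  Q3={q3}  IQR={iqr}")
print(f"variance (n)   = {pop_variance:.1f}   SD = {pop_sd:.2f}")
print(f"variance (n-1) = {sample_variance:.1f}   SD = {sample_sd:.2f}")
```

The slide's figures (variance 11.2, standard deviation 3.35) correspond to the population formulas; with n - 1 in the denominator the same data give 14 and about 3.74.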
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
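A minimal sketch of both ideas for a sample mean follows. The heart rates are hypothetical values invented for illustration, and the critical value 2.262 is the two-sided 95% point of Student's t distribution with 9 degrees of freedom, hard-coded here because the sample has 10 observations.

```python
import math
import statistics

# Hypothetical sample: heart rates (beats/min) of 10 patients. Illustrative only.
heart_rates = [72, 68, 75, 80, 66, 70, 74, 78, 69, 73]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)    # sample SD (divides by n - 1)

# Standard error of the mean: the SD of the sample mean, viewed as a statistic.
standard_error = sample_sd / math.sqrt(n)

# Two-sided 95% critical value of Student's t with n - 1 = 9 degrees of freedom.
T_CRITICAL_DF9 = 2.262

margin = T_CRITICAL_DF9 * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}")
print(f"95% CI: {ci_lower:.1f} to {ci_upper:.1f}")
```

A larger sample shrinks the standard error, and with it the width of the interval, which is one way of seeing why sample size matters.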
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                           H0 is true       H0 is false
Decision: accept H0        Good decision    Type II error
Decision: reject H0        Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
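To make the p-value definition concrete, here is a sketch using an exact permutation test. The lecture does not name a particular test at this point, so the choice of test is an assumption made for illustration, and the heart rates for the two drugs are hypothetical. Under the null hypothesis the drug labels are interchangeable, so the p-value is the share of all possible relabelings that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) after each drug. Illustrative only.
inderal = [72, 75, 70, 78, 74]
new_drug = [66, 69, 71, 64, 68]

observed = mean(inderal) - mean(new_drug)
pooled = inderal + new_drug
indices = range(len(pooled))

extreme = 0   # relabelings at least as extreme as the observed difference
total = 0     # all possible relabelings
for chosen in combinations(indices, len(inderal)):
    group1 = [pooled[i] for i in chosen]
    group2 = [pooled[i] for i in indices if i not in chosen]
    diff = mean(group1) - mean(group2)
    if abs(diff) >= abs(observed) - 1e-9:   # two-sided, small float tolerance
        extreme += 1
    total += 1

p_value = extreme / total
alpha = 0.05                     # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"observed difference = {observed:.1f} beats/min")
print(f"p-value = {extreme}/{total} = {p_value:.4f}; reject H0: {reject_h0}")
```

With only five patients per group the test can detect this difference only because the two groups barely overlap; smaller true differences would need a larger sample to reach adequate power.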
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8: Weight (kg) on the y-axis against Age (y) on the x-axis]
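The fitted line on the plot, y = 2.03x + 8, can be used for prediction as sketched below. The least-squares helper and the age and weight values passed to it are hypothetical additions for illustration; they are not the data behind the slide's plot.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years), using the slide's line by default."""
    return slope * age_years + intercept

# Hypothetical ages (years) and weights (kg). Illustrative only.
ages = [1, 3, 5, 7, 9, 11]
weights = [10.5, 14.0, 18.5, 22.0, 26.5, 30.0]

slope, intercept = fit_line(ages, weights)
print(f"fitted line: weight = {slope:.2f} * age + {intercept:.2f}")
print(f"slide's line at age 6: {predict_weight(6):.2f} kg")
```

Predictions are only trustworthy inside the age range covered by the data (roughly 0 to 14 years on the plot's axis).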
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                     Disease (outcome) present    Disease (outcome) absent
Test positive        TP                           FP
Test negative        FN                           TN
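From the four cells of this table the usual accuracy measures follow directly. The sketch below uses hypothetical counts. The predictive values (PPV, NPV) are standard measures added here for completeness; they are not named on the slide.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy measures from the four cells of a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # disease-free patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly disease-free
    }

# Hypothetical counts: 50 patients with the disease, 100 without. Illustrative only.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are read down the columns of the table (by true disease status), whereas the predictive values are read across the rows (by test result) and therefore change with how common the disease is.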
[Two ROC curve plots: Sensitivity on the y-axis against 1 - Specificity on the x-axis, both from 0.0 to 1.0. Diagonal segments are produced by ties.]
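For a continuous test, each possible cut-off gives one 2 x 2 table and therefore one point on the ROC curve. The sketch below builds those points and the area under the curve (trapezoidal rule) for a hypothetical marker; the scores and disease labels are invented for illustration. Patients sharing the same score enter together, which is what produces diagonal segments when there are ties.

```python
def roc_points(scores, diseased):
    """(1 - specificity, sensitivity) for every cut-off, from strictest to loosest."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]                      # cut-off above every score
    for cutoff in sorted(set(scores), reverse=True):
        # Test is called positive when the score is at or above the cut-off.
        tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical marker values and true disease status (1 = present). Illustrative only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
diseased = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
for fpr, tpr in points:
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
print(f"area under the curve = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test.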
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 14
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
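Percentile values depend on the interpolation rule, and the slide does not say which one is used. The sketch below assumes the common (n + 1)p position rule, which is what statistics.quantiles applies with its default method="exclusive". Other rules give slightly different answers.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the 11 values from the slide

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Percentiles: 99 cut points; index k - 1 holds the k-th percentile.
percentiles = statistics.quantiles(data, n=100)
p70 = percentiles[69]
p90 = percentiles[89]

print(q1, q2, q3, p70, p90)
```

With 11 values, (n + 1) = 12, so the quartile positions fall exactly on the 3rd, 6th and 9th observations and no interpolation is needed for them.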
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
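The spread figures quoted for 1, 1, 3, 5, 10 can be reproduced as follows. Two assumptions are needed to match them: the quartiles use the inclusive interpolation rule, and the variance divides by n (population formula). Dividing by n − 1 (sample formula) gives 14 instead of 11.2.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the inclusive rule (assumed; the slide does not name a rule).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance_n = statistics.pvariance(data)         # divides by n
sd_n = statistics.pstdev(data)                  # square root of the above
variance_n_minus_1 = statistics.variance(data)  # divides by n - 1

print(value_range, iqr, variance_n, round(sd_n, 2), variance_n_minus_1)
```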
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
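A minimal sketch of both quantities for a sample mean, assuming the standard error is estimated as s / √n and the interval uses the normal approximation (mean ± 1.96 × SE). The heart-rate values are made up for illustration; with a sample this small, a t-based multiplier would be more appropriate.

```python
import math
import statistics

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
sample = [72, 75, 70, 78, 74, 69, 80, 73, 71, 76]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample SD (divides by n - 1)
se = sd / math.sqrt(n)          # standard error of the mean

z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as n grows, so a larger sample gives a narrower interval around the same mean.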
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
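One way to test the null hypothesis that the two population means are equal is an exact permutation test, sketched below. This is an illustrative choice, not necessarily the test intended in the lecture, and both samples are invented for the example. The idea: if the drug label made no difference, every way of splitting the ten readings into two groups of five would be equally likely, so the p-value is the share of splits whose mean difference is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not real trial data.
inderal = [72, 75, 70, 78, 74]
new_drug = [65, 68, 66, 70, 64]

observed = mean(inderal) - mean(new_drug)
pooled = inderal + new_drug
n = len(inderal)

extreme = 0
total = 0
# Enumerate every way to assign n of the pooled readings to the first group.
for idx in combinations(range(len(pooled)), n):
    chosen = set(idx)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = mean(group_a) - mean(group_b)
    if abs(diff) >= abs(observed) - 1e-12:  # two-sided comparison
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

A small p-value means a difference this large would rarely arise if the null hypothesis were true, which is the evidence used to reject it.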
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0     H0 is true       H0 is false
Accept                Good decision    Type II error
Reject                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8. X-axis: Age (y), 0 to 14. Y-axis: Weight (kg), 0 to 45]
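The fitted line from the chart, y = 2.03x + 8, can be used to predict weight from age. A small sketch follows; the function name is illustrative, and predictions are only meaningful within the age range covered by the plot (roughly 0 to 14 years on the axis).

```python
def predict_weight(age_years: float) -> float:
    """Predicted weight in kg from the slide's fitted line y = 2.03x + 8."""
    slope = 2.03      # kg gained per additional year of age
    intercept = 8.0   # predicted weight at age 0
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(age, round(predict_weight(age), 2))
```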
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is     Disease (outcome) present    Disease (outcome) absent
Positive        TP                           FP
Negative        FN                           TN
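From the four cells of this table the usual accuracy measures follow directly. The sketch below uses invented counts; the function and field names are illustrative.

```python
def diagnostic_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are true disease
        "npv": tn / (tn + fn),          # negative results that are truly disease-free
    }

# Hypothetical counts, for illustration only.
result = diagnostic_measures(tp=45, fp=10, fn=5, tn=90)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are read down the columns of the table, while the predictive values are read across the rows, so the predictive values change with how common the disease is in the sample.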
[Figure: two ROC curves. X-axis: 1 - Specificity, 0.0 to 1.0. Y-axis: Sensitivity, 0.0 to 1.0. Note on each panel: Diagonal segments are produced by ties.]
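For a continuous test, each possible cut-off yields one (1 - specificity, sensitivity) pair, and the ROC curve joins those points. Below is a minimal sketch with invented scores and disease labels, assuming higher scores indicate disease; the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; a result is positive if score >= cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

pts = roc_points(scores, labels)
print(pts)
print(round(auc(pts), 3))
```

When a diseased and a disease-free patient share the same score, one cut-off moves both coordinates at once, which is what produces a diagonal segment on the plot.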
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 17
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Simple bar chart: Postoperative Complications; y-axis: Number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough]
Alternative
Width
Spacing
[Simple bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
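As a quick check, the four measures of location can be computed for the example data 1, 1, 3, 5, 10 with Python's standard `statistics` module. The midrange has no library function, so it is computed here from its usual definition, (minimum + maximum) / 2.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]

data_mean = mean(data)                      # sum of values / number of values
data_median = median(data)                  # middle value of the sorted data
data_mode = mode(data)                      # most frequent value
data_midrange = (min(data) + max(data)) / 2 # halfway between the extremes

print("Mean     =", data_mean)       # 4
print("Median   =", data_median)     # 3
print("Mode     =", data_mode)       # 1
print("Midrange =", data_midrange)   # 5.5

# With an even number of values the median is the average of the two
# middle values ("Odd vs Even").
even_median = median([1, 3, 5, 9])
print("Median of 1-3-5-9 =", even_median)   # 4.0
```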
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
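The values above can be reproduced with Python's `statistics` module. Two choices in this sketch are assumptions, made because they match the slide's numbers: the population formulas (dividing by n) for variance and standard deviation, and the "inclusive" method for quartiles. Other conventions give different answers; for example, the sample variance (dividing by n - 1) of the same data is 14.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)          # largest minus smallest value

# Quartiles; several conventions exist, "inclusive" is one of them.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                               # inter-quartile range (Q3 - Q1)

pop_variance = pvariance(data)              # mean squared deviation, divides by n
pop_sd = pstdev(data)                       # square root of the variance
sample_variance = variance(data)            # divides by n - 1 instead

print("Range =", data_range)                        # 9
print("Q1, Q3 =", q1, q3)                           # 1.0 5.0
print("Inter-quartile range =", iqr)                # 4.0
print("Variance =", round(pop_variance, 2))         # 11.2
print("Standard deviation =", round(pop_sd, 2))     # 3.35
print("Sample variance =", sample_variance)         # 14
```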
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
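A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart rates are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of 10 heart rates (beats/min), illustrative only.
heart_rates = [72, 68, 75, 80, 66, 70, 74, 77, 69, 73]

n = len(heart_rates)
sample_mean = mean(heart_rates)

# Standard error of the mean: sample SD divided by the square root of n.
standard_error = stdev(heart_rates) / sqrt(n)

# Two-sided 95% interval: 2.5% in each tail of the standard normal curve.
z = NormalDist().inv_cdf(0.975)             # about 1.96
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Mean = {sample_mean:.1f}")
print(f"Standard error = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```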
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                            H0 is true       H0 is false
Decision: accept H0         Good decision    Type II error
Decision: reject H0         Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
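The definition of the p-value can be made concrete with an exact permutation test, shown here as one possible illustration rather than the test the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible regroupings whose difference in means is at least as extreme as the observed one. The function name and the heart rates are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    # Enumerate every way of choosing which values belong to group A.
    for indices in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in indices)
        diff = abs(a_sum / n_a - (total - a_sum) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical heart rates (beats/min) under each drug, illustrative only.
inderal = [70, 72, 68, 75, 71]
new_drug = [64, 66, 69, 63, 67]

p_value = permutation_p_value(inderal, new_drug)
print(f"p-value = {p_value:.4f}")
```

Full enumeration is only practical for small samples; with two groups of five there are 252 regroupings.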
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
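A fitted line such as y = 2.03x + 8 is typically obtained by least squares. The sketch below shows that calculation using only the standard library. The ages and weights are hypothetical: they are generated to lie exactly on the slide's line so that the fit recovers its slope and intercept, and `fit_line` is an illustrative helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line weight = 2.03 * age + 8.
ages = [2, 4, 6, 8, 10]                     # years
weights = [2.03 * a + 8 for a in ages]      # kg

slope, intercept = fit_line(ages, weights)
print(f"weight = {slope:.2f} * age + {intercept:.2f}")

# Prediction: expected weight of a 7-year-old under this fitted line.
predicted = slope * 7 + intercept
print(f"Predicted weight at age 7 = {predicted:.2f} kg")
```

Real measurements scatter around the line, so the fitted coefficients would be estimates with their own standard errors.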
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present    Disease absent
Test positive      TP                 FP
Test negative      FN                 TN
[Two ROC curve plots: y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.]
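The two quantities on the ROC axes follow directly from the TP, FP, FN and TN counts. A minimal sketch, with hypothetical counts and an illustrative function name:

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cell counts."""
    sensitivity = tp / (tp + fn)    # diseased patients correctly testing positive
    specificity = tn / (tn + fp)    # healthy patients correctly testing negative
    return sensitivity, specificity


# Hypothetical counts for 100 patients, illustrative only.
sens, spec = sensitivity_specificity(tp=40, fp=10, fn=5, tn=45)

print(f"Sensitivity     = {sens:.3f}")
print(f"Specificity     = {spec:.3f}")
print(f"1 - Specificity = {1 - spec:.3f}")
```

For a continuous test, each cutoff value gives one such pair, and plotting sensitivity against 1 - specificity over all cutoffs traces the ROC curve.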
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 20
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of, and need for, statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes:
Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Effect of variability:
Large amount of data describing the same thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable.
The Variables (rows) and the Data (cells)
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Sample and Population
All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
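The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the patient counts and group labels are hypothetical (the counts were picked so that the percentages resemble those of a 51-patient complications chart), and the function name frequency_table is an assumption, not something taken from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: one complication and one study group recorded per patient.
complications = ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 5 + ["Cough"] * 20
groups = (["Group I", "Group II"] * 26)[:len(complications)]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)                  # frequency, in first-seen order
    total = sum(counts.values())
    running = accumulate(counts.values())     # cumulative frequency
    return [(cat, n, n / total, cum)
            for (cat, n), cum in zip(counts.items(), running)]


table = frequency_table(complications)
for cat, n, rel, cum in table:
    print(f"{cat:<10}{n:>4}{rel:>8.1%}{cum:>5}")

# Cross tabulation: count each (group, complication) combination.
cross_tab = Counter(zip(groups, complications))
print(cross_tab[("Group I", "Pain")], cross_tab[("Group II", "Pain")])
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for purely nominal categories such as these it depends on the arbitrary order of the rows.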
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate chart is almost always a good idea.
What ‘appropriate’ means depends primarily on the type of data, as well as on what particular features of it you want to explore.
Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart of Postoperative Complications with slices for Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8% and 39.2%]
Advantage:
1. Summarize (area - relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: two simple bar charts of Postoperative Complications (Nausea, Vomiting, Pain, Cough); vertical axis Number of patients (0 to 25) in one chart and Number of patients (%) (0% to 45%) in the other]
Alternative: width, spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; vertical axis Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; vertical axis Number of patients (%), 0% to 45%]
[Figure: stacked bar chart of Postoperative Complications for Group I and Group II, each bar divided into Cough, Pain, Vomiting and Nausea; vertical axis Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; vertical axis Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
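As a quick check of these summary measures, the sketch below recomputes them for the slide's data set 1, 1, 3, 5, 10 using Python's standard statistics module. The midrange has no library function, so it is computed directly as (minimum + maximum) / 2, which for this data is (1 + 10) / 2 = 5.5. The variable names are illustrative.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)            # sum of the values divided by their count
median = statistics.median(data)        # middle value of the sorted data (odd count)
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the smallest and largest value

# With an even number of values, the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean (4) is pulled above the median (3) by the single large value 10, which is the usual argument for preferring the median when data are skewed.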
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
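A hedged sketch of how these quartiles and percentiles could be computed for the 11-value data set. Several conventions for percentiles exist and give slightly different answers; this example assumes the "exclusive" method of Python's statistics.quantiles, which places the p-th percentile at position p(n + 1)/100 in the sorted data and interpolates between neighbours. The lecture does not say which convention it uses.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Percentiles: 99 cut points; index k - 1 holds the k-th percentile.
percentiles = statistics.quantiles(data, n=100, method="exclusive")
p70 = percentiles[69]
p90 = percentiles[89]

print(q1, q2, q3, p70, p90)
```

With 11 values the first and third quartiles fall exactly on the 3rd and 9th sorted values (5 and 17), while the 70th and 90th percentiles fall between two values and are interpolated.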
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35 (square root of the variance)
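The spread measures for 1, 1, 3, 5, 10 can be reproduced with the standard library, as sketched below. Two assumptions are worth stating: the quartiles use the "inclusive" convention of statistics.quantiles (which gives Q1 = 1 and Q3 = 5, hence an inter-quartile range of 4), and the variance of 11.2 corresponds to the population formula that divides by n. The sample formula, dividing by n - 1, gives 14 for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles under the "inclusive" convention; other conventions can differ.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas divide the sum of squared deviations (56) by n = 5.
population_variance = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

# Sample formulas divide by n - 1 = 4 instead.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(value_range, iqr, population_variance, round(population_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

When the data are a sample used to estimate a population value, the n - 1 version is the one normally reported.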
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
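A minimal sketch of the standard error of a mean and an approximate 95% confidence interval. The heart-rate sample is hypothetical, the function name mean_with_ci95 is an assumption, and the interval uses the normal approximation (mean ± 1.96 × SE); for a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics


def mean_with_ci95(sample):
    """Return the sample mean, its standard error, and an approximate 95% CI."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(sample) / math.sqrt(n)
    # 97.5th percentile of the standard normal distribution (about 1.96).
    z = statistics.NormalDist().inv_cdf(0.975)
    return mean, se, (mean - z * se, mean + z * se)


heart_rates = [72, 75, 68, 80, 77, 74, 70, 79]  # hypothetical sample, beats/min
mean, se, (low, high) = mean_with_ci95(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample roughly halves the width of the interval.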
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question. For example, is stress a risk factor for breast cancer?
To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people taking Inderal be μ1
Let the mean heart rate of all people taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 (rows) against the truth of H0 (columns):
             H0 is true      H0 is false
Accept H0    Good decision   Type II error
Reject H0    Type I error    Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
Sample size & power of the study
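The p-value definition above can be made concrete with an exact permutation test, sketched below. This is an illustrative technique chosen because it follows the definition literally; it is not a test named in the lecture. The heart-rate values are hypothetical, and the function name permutation_p_value is an assumption. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible relabellings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of choosing which pooled values are labelled "A".
    for chosen in combinations(range(len(pooled)), len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding in the comparison.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical heart rates (beats/min) under the conventional and the new drug.
inderal = [70, 72, 68, 75, 71]
new_drug = [64, 66, 63, 69, 65]
p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

For these made-up numbers only 4 of the 252 possible relabellings are as extreme as the observed split, giving p of about 0.016, which would lead to rejecting H0 at the conventional α = 0.05. Exhaustive enumeration is only practical for small samples.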
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8; age axis 0 to 14, weight axis 0 to 45]
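The fitted line from this figure, y = 2.03x + 8 with x as age in years and y as weight in kg, can be used for prediction as in the short sketch below. The function name predict_weight_kg is an assumption, and the 0 to 14 year limit simply mirrors the age axis of the figure; predicting outside the range of the observed data is not supported by the fit.

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predict weight (kg) from age (years) using the line y = 2.03x + 8."""
    if not 0 <= age_years <= 14:
        # The figure only covers ages 0 to 14; avoid extrapolating beyond it.
        raise ValueError("age is outside the range covered by the fitted line")
    return slope * age_years + intercept


for age in (0, 5, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope says that predicted weight rises by about 2 kg for each additional year of age, and the intercept (8 kg) is the predicted weight at age 0.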
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease (outcome) present   Disease (outcome) absent
Test positive    TP                          FP
Test negative    FN                          TN
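From the four cells of this table (TP, FP, FN, TN) the usual accuracy measures of a diagnostic test follow directly. The sketch below uses hypothetical counts, and the function name diagnostic_metrics is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        # Proportion of diseased patients that the test correctly flags.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free patients that the test correctly clears.
        "specificity": tn / (tn + fp),
        # Probability that a positive result really indicates disease.
        "positive_predictive_value": tp / (tp + fp),
        # Probability that a negative result really indicates no disease.
        "negative_predictive_value": tn / (tn + fn),
    }


# Hypothetical counts for 100 patients.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.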
[Figure: two ROC curves plotting Sensitivity against 1 - Specificity, both axes 0.0 to 1.0. Diagonal segments are produced by ties.]
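For a continuous diagnostic test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. The sketch below shows one simple way to do that; the scores and disease labels are hypothetical, and the function names roc_points and area_under_curve are assumptions rather than part of any particular statistics package.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, one per candidate cut-off."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    # A patient tests positive when the score is at or above the cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
diseased = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test. When a diseased and a non-diseased patient share the same score, both coordinates change at that cut-off, which is what produces diagonal segments on the curve.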
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
At best, the net effect is to waste time, effort, and money for the project.
At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.
Thank you
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 23
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of and need for statistics in the medical field
Distinguish types of data and variables
Describe types of statistics and statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
[Diagram: Collecting data, Describing data, Analyzing data, Good decision]
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable.
The variables (rows) and the data (values):
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, or summarizing it numerically.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Group the values into classes first.)
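The frequency measures listed on this slide can be sketched in a few lines of Python. This is only an illustration: the observations below are hypothetical, and the function name frequency_table is an assumption, not something from the lecture.

```python
from collections import Counter

# Hypothetical observations of one categorical variable (illustrative only).
complications = ["Nausea", "Pain", "Nausea", "Cough", "Pain",
                 "Vomiting", "Pain", "Nausea", "Pain", "Vomiting"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category in sorted(counts):      # categories in alphabetical order
        running += counts[category]      # cumulative count so far
        rows.append((category, counts[category], counts[category] / total, running))
    return rows

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.0%}{cum:>4}")
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for a purely categorical variable the alphabetical order used here is arbitrary.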
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarizes one variable (area shows relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. y-axis: Number of patients (0 to 25). Categories: Nausea, Vomiting, Pain, Cough]
Alternative
Width, spacing
[Bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 45%). Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 45%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 45%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
[Stacked bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 100%). Bars: Group I, Group II. Segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 40%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
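As a quick check, the four measures of location can be computed for the same five values with Python's standard library. This is a minimal sketch; the midrange is taken as (minimum + maximum) / 2, which gives 5.5 for this data set.

```python
import statistics

data = [1, 1, 3, 5, 10]   # the five values used on the slides

mean = statistics.mean(data)               # sum of values / number of values
median = statistics.median(data)           # middle value of the sorted data
mode = statistics.mode(data)               # most frequent value
midrange = (min(data) + max(data)) / 2     # halfway between smallest and largest

# With an even number of values the median is the average of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.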
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (population formula, dividing by n)
Standard deviation = 3.35
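The figures above can be reproduced with Python's statistics module. Two points in this sketch are assumptions about how the slide values were obtained: the quartiles use the "inclusive" method, and the variance divides by n (population formula). Dividing by n - 1 (sample formula) gives 14 instead of 11.2.

```python
import statistics

data = [1, 1, 3, 5, 10]   # the five values used on the slides

value_range = max(data) - min(data)

# Quartiles with the inclusive method (data treated as including its min and max).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)     # divides by n
pop_sd = statistics.pstdev(data)              # square root of pop_variance
sample_variance = statistics.variance(data)   # divides by n - 1

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

Other quartile conventions give different answers on such a small data set, so the inter-quartile range should always be reported together with the method used.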
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
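A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical, and the interval uses the normal approximation (multiplier of about 1.96); with a sample this small a t-distribution multiplier is usually preferred and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); illustrative values only.
heart_rates = [72, 68, 75, 70, 66, 74, 71, 69, 73, 72]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)      # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)      # SD of the sample mean

# 97.5th percentile of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The sample mean is the statistic; the interval is the informed guess about the population parameter.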
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people taking Inderal be μ1
Let the mean heart rate of all people taking the new drug be μ2
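One way to test the null hypothesis that the two population means are equal is an exact permutation test, sketched below. This is an illustration only: the slides do not say which test the lecturer used, the heart-rate values are hypothetical, and the function name permutation_p_value is an assumption.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) after each drug; illustrative values only.
inderal = [78, 74, 80, 76, 79, 77]
new_drug = [70, 72, 68, 75, 71, 69]

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means.

    Under H0 the group labels are exchangeable, so every way of splitting
    the pooled values into groups of the original sizes is equally likely.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)
    extreme = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:   # small tolerance for rounding
            extreme += 1
        count += 1
    return extreme / count

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

Enumerating every split is only feasible for small samples (924 splits here); larger studies typically use a t-test or a random subset of permutations.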
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
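The definition of the p-value can be made concrete with a small exact calculation. The scenario is hypothetical: 9 of 10 patients respond to a treatment, and the null hypothesis says the response probability is 0.5. The function name and the 0.05 significance level are assumptions for illustration.

```python
from math import comb

def binomial_p_value(successes, n, p_null=0.5):
    """One-sided p-value: P(X >= successes) when X ~ Binomial(n, p_null)."""
    return sum(comb(n, k) * p_null**k * (1 - p_null)**(n - k)
               for k in range(successes, n + 1))

# Probability of 9 or 10 responders out of 10 if H0 (p = 0.5) were true.
p_value = binomial_p_value(9, 10)

alpha = 0.05                  # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value is compared with α, which is chosen before the data are seen; a small p-value says the observed outcome would be unusual if H0 were true, not that H0 is certainly false.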
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8. x-axis: Age (y), 0 to 14. y-axis: Weight (kg), 0 to 45]
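Prediction with a fitted line amounts to plugging a value into its equation. The sketch below uses the equation shown on the plot, assuming x is age in years and y is weight in kilograms as the axis labels suggest; the function name predict_weight is hypothetical.

```python
SLOPE = 2.03       # kg gained per year of age, from the plotted equation
INTERCEPT = 8.0    # predicted weight (kg) at age 0

def predict_weight(age_years):
    """Predicted weight (kg) from age (years) using y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT

for age in (2, 6, 10):
    print(age, round(predict_weight(age), 2))
```

Such predictions are only reasonable inside the range of ages that was observed (0 to 14 years on the plot); extrapolating beyond it is unsafe.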
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
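The four cells of the table give the usual summary measures of a dichotomous diagnostic test. A minimal sketch, with hypothetical counts and a hypothetical function name:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard measures derived from the 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased subjects who test positive
        "specificity": tn / (tn + fp),   # healthy subjects who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }

# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=40, fp=10, fn=20, tn=130)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.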
[Two ROC curve plots. x-axis: 1 - Specificity (0.0 to 1.0). y-axis: Sensitivity (0.0 to 1.0). Note on both plots: Diagonal segments are produced by ties.]
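For a continuous diagnostic test, an ROC curve is obtained by trying every cut-off and plotting sensitivity against 1 - specificity. The sketch below is an illustration under stated assumptions: the marker values and disease labels are hypothetical, a subject counts as test-positive when the value is at or above the cut-off, and the function names are not from the lecture.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]                       # cut-off above every score
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical marker values and disease status (True = disease present).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
status = [True, True, False, True, False, True, False, False]

points = roc_points(scores, status)
print(points)
print("AUC =", auc(points))
```

Because each distinct value is used as one cut-off, subjects with tied values move the curve in both directions at once, which is what produces diagonal segments on an ROC plot.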
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 25
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
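The four measures of location can be checked with Python's standard `statistics` module. This is a minimal sketch on the slide's data set; midrange is taken here to mean (minimum + maximum) / 2, which is the usual definition.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

mean = statistics.mean(data)            # sum of the values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.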
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
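Several conventions exist for locating a percentile, and they can give different answers on small data sets. The sketch below assumes the common (n + 1) rule, which is what `statistics.quantiles(..., method="exclusive")` implements in Python 3.8 and later; the slides do not say which rule was used.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the 11 sorted values from the slide

# method="exclusive" puts the p-th percentile at position p * (n + 1) / 100
# in the sorted data and interpolates between neighbours when needed.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Cutting the data into 10 parts gives the 10th, 20th, ..., 90th percentiles.
deciles = statistics.quantiles(data, n=10, method="exclusive")
p70, p90 = deciles[6], deciles[8]

print(q1, q2, q3, p70, p90)
```

With 11 values the quartiles fall exactly on the 3rd, 6th and 9th observations, so no interpolation is needed for them; the 70th and 90th percentiles fall between observations.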
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
Standard deviation = 3.35 (square root of 11.2)
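A minimal sketch of the measures of spread, again with the standard `statistics` module. Two assumptions are made: quartiles use the "inclusive" rule, and both the population (divide by n) and sample (divide by n − 1) versions of the variance are shown because the two are easy to confuse.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles by the "inclusive" rule (the data set is treated as containing
# its own minimum and maximum).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)    # sum of squared deviations / n
pop_sd = statistics.pstdev(data)        # square root of pop_var
sample_var = statistics.variance(data)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)      # square root of sample_var

print(value_range, iqr, pop_var, round(pop_sd, 2), sample_var, round(sample_sd, 2))
```

When the data are a sample used to estimate a population's spread, the n − 1 version is the conventional choice.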
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
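As an illustration only, the standard error of a mean and a 95% confidence interval can be sketched as follows. The five values are reused from the descriptive slides and treated as a random sample (an assumption). The critical value 2.776 is the two-sided 95% point of Student's t distribution with 4 degrees of freedom, hard-coded because the standard library has no t distribution.

```python
import math
import statistics

sample = [1, 1, 3, 5, 10]  # treated here as a sample from a larger population

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample SD (divides by n - 1)
se = sd / math.sqrt(n)         # standard error of the mean

T_CRIT = 2.776  # two-sided 95% critical value of t with n - 1 = 4 degrees of freedom
ci_low = mean - T_CRIT * se
ci_high = mean + T_CRIT * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The interval is wide because the sample is tiny; the standard error shrinks in proportion to the square root of the sample size.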
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
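Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean heart rate on the new drug, the example can be stated formally. The one-sided alternative below is an assumption based on the question "does the new drug decrease heart rate more?"; a two-sided alternative (μ1 ≠ μ2) is equally common.

```latex
H_0:\ \mu_1 = \mu_2
\qquad\text{versus}\qquad
H_1:\ \mu_2 < \mu_1
```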
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0     H0 is true       H0 is false
Accept H0                 Good decision    Type II error
Reject H0                 Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
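The definition of the p-value can be made concrete with an exact permutation test. This sketch is not the test used in the lecture. It is chosen because it applies the definition directly: under H0 the group labels are interchangeable, so the p-value is the fraction of relabellings that give a difference at least as extreme as the one observed. The heart rates are invented for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); invented numbers, not study data.
inderal = [72, 75, 70, 78, 74]
new_drug = [66, 69, 71, 64, 68]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
n = len(inderal)

count_all = 0
count_extreme = 0
# Every way of choosing which n pooled values are labelled "Inderal".
for chosen in combinations(range(len(pooled)), n):
    group1 = [pooled[i] for i in chosen]
    group2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = mean(group1) - mean(group2)
    count_all += 1
    # One-sided: count differences at least as large as the observed one.
    if diff >= observed - 1e-12:
        count_extreme += 1

p_value = count_extreme / count_all
ALPHA = 0.05  # accepted probability of a type I error
reject_h0 = p_value < ALPHA

print(f"observed difference = {observed:.1f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Choosing α before looking at the data fixes the type I error rate; β, and hence the power (1 − β), then depends on the sample size and on the true size of the effect.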
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: plot of weight against age with the line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
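The line y = 2.03x + 8 from the figure can be used for prediction, and a line of this kind is usually obtained by least squares. A minimal sketch of both follows; the function names and the three data points passed to `fit_line` are hypothetical.

```python
from statistics import mean

SLOPE = 2.03     # kg per year of age, from the line on the slide
INTERCEPT = 8.0  # kg


def predict_weight_kg(age_years):
    """Weight predicted by the slide's line for a given age in years."""
    return SLOPE * age_years + INTERCEPT


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired data."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


print(predict_weight_kg(6))

# Hypothetical points lying exactly on y = 2x + 8, to show the fit recovers the line.
slope, intercept = fit_line([0, 2, 4], [8, 12, 16])
print(slope, intercept)
```

Predictions are only trustworthy inside the range of ages used to fit the line (0 to 14 years on the figure's axis).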
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
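The four cells of the table (TP, FP, FN, TN) give the usual summary measures of a dichotomous diagnostic test. A minimal sketch, with hypothetical counts and an illustrative function name:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures computed from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }


# Hypothetical counts, invented for illustration.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.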
[Figure: two ROC curves; x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Diagonal segments are produced by ties.]
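For a continuous test result, each possible cut-off yields one pair (1 − specificity, sensitivity), and joining the pairs gives the ROC curve. A minimal sketch under stated assumptions: the marker values and disease labels are hypothetical, a result at or above the cut-off counts as positive, and the area under the curve is taken by the trapezoidal rule.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical marker values; 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
diseased = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
print(points)
print(auc)
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.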
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 28
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
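As a rough illustration of the three tabulations listed above, the sketch below counts a small set of complication labels. The observations are invented for illustration (they are not the lecture's data), and the function name frequency_table is hypothetical. Note that a cumulative frequency is most meaningful when the categories have a natural order.

```python
from collections import Counter

# Hypothetical complication recorded for each of 10 patients (illustrative only).
observations = ["Pain", "Nausea", "Pain", "Cough", "Vomiting",
                "Pain", "Nausea", "Pain", "Vomiting", "Nausea"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    # Most frequent category first.
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, count / total, running))
    return rows

for category, freq, rel, cum in frequency_table(observations):
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>4}")
```

The relative frequencies always sum to 1, and the last cumulative frequency equals the number of observations.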
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
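[Figure: pie chart "Postoperative Complications"; categories: Nausea, Vomiting, Pain, Cough; slices: 23.5%, 27.5%, 9.8%, 39.2% (which slice belongs to which category is not recoverable from the transcript)]
Advantages:
1. Summarizes one variable (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")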
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart "Postoperative Complications"; y-axis: number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough]
Width
Spacing
Alternative
[Figure: the same bar chart with the y-axis as number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart "Postoperative Complications"; series: Group I, Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart "Postoperative Complications"; series: Group I, Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]
[Figure: stacked bar chart "Postoperative Complications"; one bar each for Group I and Group II; y-axis: number of patients (%), 0% to 100%; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between the bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: chart "Postoperative Complications"; series: Group I, Group II; y-axis: number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
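The four location measures can be checked with a few lines of Python. This is a minimal sketch using the standard statistics module on the slide's example values; the midrange is taken as the average of the smallest and largest value, (1 + 10) / 2 = 5.5. The second data set (1, 3, 5, 9) illustrates the even-count case, where the median is the average of the two middle values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the slide's example values

mean = statistics.mean(data)            # sum of the values divided by their count
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# Even number of values: the median is the average of the two middle values.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.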
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
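These spread measures can be reproduced with the standard statistics module. Two choices in this sketch are assumptions rather than something the slides state: the quartiles use the "inclusive" interpolation method (other conventions give different quartiles for so few values), and the variance of 11.2 corresponds to the population formula, which divides by n. The sample formula, dividing by n - 1, gives 14 for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the slide's example values

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method; other conventions may differ.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)    # squared deviations divided by n
pop_sd = statistics.pstdev(data)        # square root of the population variance
sample_var = statistics.variance(data)  # squared deviations divided by n - 1

print(value_range, iqr, pop_var, round(pop_sd, 2), sample_var)
```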
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
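A minimal sketch of both ideas for a sample mean follows. The heart rates are hypothetical, and the interval uses the normal approximation (multiplier of about 1.96 for 95%); for a sample this small a t-distribution multiplier would give a somewhat wider interval. The function name mean_ci_normal is an illustrative choice.

```python
import math
import statistics

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

def mean_ci_normal(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided multiplier, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

mean, se, (lower, upper) = mean_ci_normal(heart_rates)
print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample shrinks the standard error, and therefore narrows the interval around the estimate.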
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0, by whether H0 is true or false:

                 H0 is true       H0 is false
Accept H0        Good decision    Type II error
Reject H0        Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
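The p-value definition above can be made concrete with an exact permutation test. This is one possible technique, chosen here because it needs no distribution tables, and is not necessarily the test the lecture has in mind. The heart rates for the two β-blocker groups are hypothetical. Under the null hypothesis of no difference, the group labels are interchangeable, so the p-value is the fraction of all possible relabelings that give a difference in means at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [70, 72, 68, 74, 71, 73]
new_drug = [64, 66, 63, 67, 65, 62]

def permutation_p_value(group1, group2):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group1 + group2
    n1 = len(group1)
    n2 = len(pooled) - n1
    observed = abs(mean(group1) - mean(group2))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of assigning n1 of the pooled values to group 1.
    for idx in combinations(range(len(pooled)), n1):
        s1 = sum(pooled[i] for i in idx)
        diff = abs(s1 / n1 - (total - s1) / n2)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

With 6 patients per group there are 924 possible relabelings, so the enumeration is instant; for larger samples a random subset of relabelings is usually drawn instead.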
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted regression line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
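Prediction with a fitted line can be sketched as follows. The function fit_line is a hypothetical helper implementing ordinary least squares for one predictor, and the data points are generated to lie exactly on the slide's line y = 2.03x + 8, so they are illustrative rather than real measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ages = [2, 4, 6, 8, 10, 12]             # years
weights = [2.03 * a + 8 for a in ages]  # kg, hypothetical points on the slide's line

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept       # predicted weight at age 7
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 7: {predicted:.2f} kg")
```

Predictions are only trustworthy inside the range of ages used to fit the line.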
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test result, by whether the disease (outcome) is present or absent:

                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
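The usual summary measures follow directly from the four cells of this table. The sketch below uses hypothetical counts, and the function name diagnostic_metrics is an illustrative choice.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures of a dichotomous diagnostic test from the 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts: 50 diseased and 100 healthy patients.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.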
[Figure: two ROC curves; y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0; note on each plot: "Diagonal segments are produced by ties."]
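For a continuous test, each possible cutoff yields one sensitivity and specificity pair, and the ROC curve joins those pairs. The sketch below builds the points and estimates the area under the curve with the trapezoidal rule. The scores and disease labels are hypothetical, the helper names roc_points and auc are illustrative, and a result at or above the cutoff is assumed to count as a positive test.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cutoff."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test results; 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
diseased = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, diseased)
print(f"AUC = {auc(points):.4f}")
```

Because one point is produced per distinct score, diseased and healthy patients sharing a score move the curve up and right in a single step, which is what produces diagonal segments for ties.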
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 31
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables and the Data
Name        Age   Gender   Blood type
Mrs Brown   32    Female   O
Mr Patel    24    Male     O
Ms Manda    20    Female   A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary):
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping.)
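The frequency measures listed above can be sketched in a few lines of Python. The ASA classes below are made-up values used only for illustration; they are not data from the lecture. Cumulative frequency is shown on an ordinal variable because it only makes sense when the categories have an order.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ASA physical status of 10 patients (illustrative data only).
asa_classes = ["I", "II", "II", "III", "I", "II", "IV", "II", "III", "I"]
order = ["I", "II", "III", "IV"]  # ordinal categories, lowest to highest

counts = Counter(asa_classes)
frequency = [counts[c] for c in order]                 # absolute counts
relative = [f / len(asa_classes) for f in frequency]   # proportions of the total
cumulative = list(accumulate(frequency))               # running totals

for c, f, r, cf in zip(order, frequency, relative, cumulative):
    print(f"ASA {c:<3} n={f}  {r:.0%}  cumulative={cf}")
```

For a numerical variable the same approach would apply after grouping the values into intervals.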
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart "Postoperative Complications"; categories: Nausea, Vomiting, Pain, Cough; slices: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart "Postoperative Complications"; y-axis: Number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough]
Alternative, width, spacing
[Figure: simple bar chart "Postoperative Complications"; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart "Postoperative Complications"; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: bar chart "Postoperative Complications"; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Figure: stacked bar chart "Postoperative Complications"; y-axis: Number of patients (%), 0% to 100%; x-axis: Group I, Group II; series: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: chart "Postoperative Complications"; y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
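As a quick check, the four measures of location can be computed for the slide's example values with Python's standard library. The midrange here is taken as (minimum + maximum) / 2, which gives 5.5 for this data set.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]  # the example values from the slide

central = {
    "mean": mean(data),                        # sum of values / number of values
    "median": median(data),                    # middle value of the sorted data
    "mode": mode(data),                        # most frequent value
    "midrange": (min(data) + max(data)) / 2,   # halfway between the extremes
}
print(central)
```

Note how the single large value (10) pulls the mean and the midrange upward, while the median and mode are unaffected.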
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
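The spread measures above can be reproduced with the sketch below. Two choices are assumptions made to match the slide's numbers: the variance divides by n (the population formula), and the quartiles use the "inclusive" interpolation method. Other quartile conventions give a different inter-quartile range for such a small data set.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]  # the example values from the slide

value_range = max(data) - min(data)
q1, q2, q3 = quantiles(data, n=4, method="inclusive")  # quartile cut points
iqr = q3 - q1
variance = pvariance(data)   # population variance: divides by n
sd = pstdev(data)            # square root of the population variance

print(value_range, iqr, variance, round(sd, 2))
```

If the five values are treated as a sample rather than a whole population, the variance divides by n - 1 and becomes 14 instead of 11.2.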
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
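A minimal sketch of both ideas for a sample mean follows. The heart rates are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors each side); for a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical heart rates (beats/min) from a sample of 8 patients.
heart_rates = [72, 75, 78, 70, 80, 74, 76, 73]

n = len(heart_rates)
sample_mean = mean(heart_rates)
standard_error = stdev(heart_rates) / sqrt(n)   # SD of the sample mean

z = NormalDist().inv_cdf(0.975)                 # about 1.96 for a 95% CI
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean={sample_mean:.2f}, SE={standard_error:.2f}, "
      f"95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.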
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
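One way to make the p-value definition concrete is an exact permutation test, sketched below. This is an illustration chosen for its simplicity, not necessarily the test the lecture has in mind, and the heart rates and the function name are hypothetical. If the null hypothesis is true, the group labels are interchangeable, so the p-value is the share of all possible relabellings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import fmean

def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means, by exhaustive relabelling."""
    pooled = list(group_a) + list(group_b)
    observed = abs(fmean(group_a) - fmean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Count relabellings at least as extreme as the observed difference.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical heart rates (beats/min); not data from the lecture.
conventional = [70, 72, 74, 76, 78]
new_drug = [60, 62, 64, 66, 68]
p_value = exact_permutation_p_value(conventional, new_drug)
print(f"p = {p_value:.4f}")
```

With these fully separated groups only 2 of the 252 possible relabellings are as extreme as the observed one, so the p-value is small and the null hypothesis of no difference would be rejected at the usual 0.05 level.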
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted regression line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
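The fitted line from the prediction slide, y = 2.03x + 8, can be used directly to predict weight from age. The function name below is hypothetical, and the coefficients are simply those shown on the slide.

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line on the slide: y = 2.03x + 8."""
    return 2.03 * age_years + 8

for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

Predictions are only meaningful within the age range covered by the plot (roughly 0 to 14 years); extending the line beyond that range is extrapolation.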
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
Test result       Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
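From the four cells (TP, FP, FN, TN) the usual summary measures of a dichotomous diagnostic test follow directly. The counts below are hypothetical, and the predictive values are standard additions that the slide itself does not list.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures of a dichotomous test from its 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }

# Hypothetical counts for 200 patients (illustrative only).
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test, whereas the predictive values also depend on how common the disease is in the group tested.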
ROC Curve
[Figure: two ROC curves; x-axis: 1 - Specificity, 0.0 to 1.0; y-axis: Sensitivity, 0.0 to 1.0]
Diagonal segments are produced by ties.
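For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and joining the pairs gives the ROC curve. The sketch below uses hypothetical scores and function names. Observations with the same score move both coordinates in one step, which is why ties appear as diagonal segments.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; a score >= cut-off counts as test positive.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical test values and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.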
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 34
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of, and need for, statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
Patient     Age   Gender   Blood type
Mrs Brown   32    Female   O
Mr Patel    24    Male     O
Ms Manda    20    Female   A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(What about numerical variables? Grouping)
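As a minimal sketch of these summaries, the Python snippet below tabulates the frequency, relative frequency and cumulative frequency of one qualitative variable. The complication counts and the function name frequency_table are hypothetical, chosen only for illustration; they are not data from the lecture.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (illustration only)
complications = ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 5 + ["Cough"] * 20


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, running = [], 0
    for category, freq in counts.items():  # categories in order of first appearance
        running += freq
        rows.append((category, freq, freq / total, running))
    return rows


for category, freq, rel, cum in frequency_table(complications):
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>5}")
```

Cumulative frequency is most informative when the categories have a natural order (ordinal data); for unordered categories such as these it only reflects the order in which they are listed.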
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications (nausea, vomiting, pain, cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes the data (area shows relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; number of patients (axis 0 to 25) for nausea, vomiting, pain and cough]
Alternative – Width – Spacing
[Bar chart: Postoperative Complications; number of patients (%) (axis 0% to 45%) for nausea, vomiting, pain and cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; number of patients (%) for nausea, vomiting, pain and cough, Group I vs Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; number of patients (%) for nausea, vomiting, pain and cough, Group I vs Group II]
[Stacked bar chart: Postoperative Complications; number of patients (%) from 0% to 100%, one bar each for Group I and Group II, stacked by cough, pain, vomiting and nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; number of patients (%) (axis 0% to 40%) for nausea, vomiting, pain and cough, Group I vs Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
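The four measures of location can be checked with a short Python sketch using the standard library's statistics module. The midrange is taken here as the point halfway between the smallest and largest values, (minimum + maximum) / 2; the second median call uses the slide's even-sized example, 1-3-5-9.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the worked example from the slides

mean = statistics.mean(data)            # sum of the values divided by their count
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between smallest and largest

# With an even number of values, the median is the mean of the two middle values
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean is pulled upward by the single large value (10), while the median and mode are not; this is one reason the median is often preferred for skewed data.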
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
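A hedged sketch of how quartiles and percentiles could be computed for the eleven-value data set above. Several interpolation conventions exist and they can give slightly different answers; the "inclusive" method of Python's statistics.quantiles is used here as one reasonable choice, not necessarily the one used in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second data set on the slide

# method="inclusive" interpolates between sorted values and treats the
# smallest and largest observations as the 0th and 100th percentiles
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# n=10 yields the nine deciles: index 6 is the 70th percentile,
# index 8 is the 90th percentile
deciles = statistics.quantiles(data, n=10, method="inclusive")
p70, p90 = deciles[6], deciles[8]

print(q1, q2, q3, p70, p90)
```

With this convention the second quartile coincides with the median of the data.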
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (population formula, dividing by n)
Standard deviation = 3.35
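The spread measures for the same five values can be reproduced with the sketch below. It computes both the population formulas (dividing by n), which match the figures 11.2 and 3.35, and the sample formulas (dividing by n - 1), which give larger values. The quartile convention ("inclusive") is an assumption; other conventions can give a different inter-quartile range.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles by the "inclusive" interpolation convention (an assumption)
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)    # sum of squared deviations / n
pop_sd = statistics.pstdev(data)        # square root of the population variance
sample_var = statistics.variance(data)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)      # square root of the sample variance

print(value_range, iqr, pop_var, round(pop_sd, 2), sample_var, round(sample_sd, 2))
```

When the five values are treated as a sample from a larger population, the n - 1 versions are the usual choice.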
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
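As an illustration, the standard error of a sample mean and an approximate 95% confidence interval can be sketched as follows. The data are the small example set reused as if it were a sample, and the interval uses the normal approximation (z of about 1.96); for a sample this small a t critical value would be more appropriate, so treat the numbers as a demonstration of the arithmetic only.

```python
import math
import statistics

sample = [1, 1, 3, 5, 10]  # the slides' small data set, treated as a sample

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample SD (n - 1 in the denominator)
standard_error = sd / math.sqrt(n)   # SD of the sampling distribution of the mean

# Normal approximation: the 97.5th percentile of the standard normal is about 1.96
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"mean = {mean}, SE = {standard_error:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The interval narrows as n grows, because the standard error shrinks in proportion to the square root of the sample size.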
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
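One way to write the two hypotheses for this example formally is sketched below, with μ1 the population mean heart rate under Inderal and μ2 the population mean heart rate under the new drug. The two-sided form of H1 is an assumption; a one-sided alternative (the new drug gives the lower mean heart rate) would also fit the question as posed.

```latex
\begin{align*}
H_0 &: \mu_1 = \mu_2 \quad \text{(no difference in mean heart rate)} \\
H_1 &: \mu_1 \neq \mu_2 \quad \text{(the mean heart rates differ)}
\end{align*}
```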
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0     H0 is true       H0 is false
Accept                Good decision    Type II error
Reject                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
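The definition of the p-value can be made concrete with a permutation test, sketched below. This is a swapped-in illustrative technique, not a method named in the lecture, and the heart-rate values and the function name permutation_p_value are hypothetical. If H0 is true the drug labels are interchangeable, so the labels are reshuffled many times and the p-value is estimated as the share of reshuffles whose difference in means is at least as extreme as the one observed.

```python
import random

# Hypothetical heart rates in beats/min (illustration only, not lecture data)
inderal = [72, 75, 70, 78, 74, 77, 73, 76]
new_drug = [68, 70, 66, 71, 69, 72, 67, 70]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means under H0."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero
    return (extreme + 1) / (n_permutations + 1)


p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

A p-value below the chosen α (commonly 0.05) would lead to rejecting H0; a larger one would not.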
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: weight (kg) plotted against age (y), with fitted line y = 2.03x + 8]
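The fitted line y = 2.03x + 8 can be used for prediction as in the minimal sketch below, where x is age in years and y is weight in kilograms. The function name predict_weight is hypothetical, and predictions are only reasonable inside the age range covered by the plot (roughly 0 to 14 years).

```python
def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years), using the line y = 2.03x + 8."""
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.2f} kg")
```

The slope says that predicted weight rises by about 2.03 kg per year of age; the intercept (8 kg) is the predicted weight at age 0.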
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
Test result     Disease present    Disease absent
Positive        TP                 FP
Negative        FN                 TN
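From the four cells of this table the usual accuracy measures follow directly. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and the positive and negative predictive values); the counts and the function name diagnostic_summary are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly testing positive
        "specificity": tn / (tn + fp),  # disease-free patients correctly testing negative
        "ppv": tp / (tp + fp),          # positive results that are truly diseased
        "npv": tn / (tn + fn),          # negative results that are truly disease-free
    }


# Hypothetical counts (illustration only)
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.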
ROC Curve
[Figure: two ROC curves plotting sensitivity against 1 - specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
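For a continuous diagnostic test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and the ROC curve joins those pairs. A minimal sketch under stated assumptions: the scores and disease labels are hypothetical, a result at or above the cut-off counts as positive, and the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # a cut-off above every score: nobody tests positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test values and disease status (1 = present, 0 = absent)
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
print(points)
print(f"AUC = {auc(points):.3f}")
```

Because cut-offs are taken from the distinct score values, a diseased and a disease-free patient sharing the same score move both coordinates in one step, which produces the diagonal segments mentioned in the figure note.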
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 36
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
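The location measures for the example values 1, 1, 3, 5, 10 can be checked with a short sketch. It uses Python's standard `statistics` module. The `midrange` helper is a hypothetical name introduced here and is computed as (min + max) / 2, which gives 5.5 for these values. The second data set (1, 3, 5, 9) illustrates the even-count case, where the median is the average of the two middle values.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]   # example values used on the slides
even_data = [1, 3, 5, 9]  # even number of values (the "Odd vs Even" case)


def midrange(values):
    """Midpoint between the smallest and the largest value (hypothetical helper)."""
    return (min(values) + max(values)) / 2


summary = {
    "mean": mean(data),          # sum of values divided by their count
    "median": median(data),      # middle value of the sorted data
    "mode": mode(data),          # most frequent value
    "midrange": midrange(data),  # (min + max) / 2
}
even_median = median(even_data)  # average of the two middle values

print(summary)
print("median of even-sized data:", even_median)
```

The mean and the midrange are pulled upward by the single large value 10, while the median and the mode are not.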
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
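The spread measures for the same five values can be reproduced with the sketch below. Two conventions are assumed here because the slides do not state them: the variance divides by n (population variance), which matches 11.2, and the quartiles use the "inclusive" interpolation method of `statistics.quantiles`, which matches an inter-quartile range of 4. Other conventions give different numbers; for example, dividing by n - 1 (sample variance) gives 14.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]  # example values used on the slides

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption; conventions differ).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = pvariance(data)  # population variance: divides by n
std_dev = pstdev(data)      # square root of the population variance

print("range:", value_range)
print("Q1, Q3, IQR:", q1, q3, iqr)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```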
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
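A minimal sketch of both quantities for a sample mean follows. The heart-rate values are hypothetical and are not data from the lecture. The standard error is taken as s / sqrt(n), and the 95% interval uses the normal approximation (z of about 1.96); for a sample this small a t-based interval would usually be preferred, so treat the numbers as illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of heart rates (beats/min); not from the lecture.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

n = len(heart_rates)
sample_mean = mean(heart_rates)

# Standard error: the standard deviation of the sample mean.
standard_error = stdev(heart_rates) / sqrt(n)

# z value that leaves 2.5% in each tail of the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```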
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
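One way to test the null hypothesis that the two population means are equal is sketched below. The technique shown is an exact permutation test on the difference in sample means. It is chosen here because it needs only the standard library, and it is not necessarily the test the lecture has in mind. The heart-rate values, the variable names and the 0.05 threshold are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [68, 72, 70, 75, 71]
new_drug = [64, 66, 69, 63, 67]

observed = mean(inderal) - mean(new_drug)

# Under H0 the group labels are exchangeable, so every way of splitting
# the pooled values into two groups of the same sizes is equally likely.
pooled = inderal + new_drug
n = len(inderal)
total = 0
as_extreme = 0
for idx in combinations(range(len(pooled)), n):
    chosen = set(idx)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = mean(group_a) - mean(group_b)
    total += 1
    # Two-sided: count splits at least as extreme as the observed one.
    if abs(diff) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / total
alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha

print(f"observed difference = {observed:.1f} beats/min")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value here is the share of label assignments that produce a difference at least as large as the observed one, which is the definition of "the outcome observed, or one more extreme, assuming the null hypothesis to be true".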
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
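The fitted line from the figure, y = 2.03x + 8, can be used for prediction as in the sketch below. The function name and the chosen ages are hypothetical. Predictions are only meaningful inside the age range shown on the plot (0 to 14 years); outside it the line is an extrapolation.

```python
SLOPE = 2.03     # kg per year of age, from the line on the slide
INTERCEPT = 8.0  # kg, predicted weight at age 0


def predict_weight(age_years: float) -> float:
    """Predicted weight (kg) from age (years) using y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT


# Example ages chosen for illustration only.
predictions = {age: round(predict_weight(age), 2) for age in (2, 6, 10)}
print(predictions)
```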
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present    Disease absent
Test positive     TP                 FP
Test negative     FN                 TN
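The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the lecture.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```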
[Figure: two ROC curves; x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Diagonal segments are produced by ties.]
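For a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) point, and the ROC curve joins those points. The sketch below builds the points by hand and estimates the area under the curve with the trapezoid rule. The scores, labels and function names are hypothetical. When diseased and healthy subjects share the same score, both coordinates change at one cut-off, which is what produces a diagonal segment.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, one per cut-off.

    A subject is called positive when score >= cut-off; labels are 1 for
    disease present and 0 for disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points)
print("AUC:", area)
```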
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 39
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
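The four summaries listed above can be sketched in a few lines of Python. The patient data below are invented purely for illustration (they are not from the lecture), and the variable names are assumptions of this sketch.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: degree of illness and study group for 10 patients.
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
group = ["I", "I", "I", "I", "I", "II", "II", "II", "II", "II"]
order = ["Mild", "Moderate", "Severe"]   # ordinal categories, in order

counts = Counter(severity)
n = len(severity)

frequency = [counts[level] for level in order]   # how many in each category
relative = [f / n for f in frequency]            # share of the total
cumulative = list(accumulate(frequency))         # running total (needs ordered categories)

# Cross tabulation: count each (group, severity) combination.
crosstab = Counter(zip(group, severity))

for level, f, r, c in zip(order, frequency, relative, cumulative):
    print(f"{level:<10} freq={f} rel={r:.0%} cum={c}")
for g in ("I", "II"):
    print("Group", g, [crosstab[(g, level)] for level in order])
```

Cumulative frequency only makes sense when the categories have an order, which is why an ordinal variable is used here. A numerical variable would first have to be grouped into intervals.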
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart, Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart, Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients, 0 to 25]
Alternative – Width – Spacing
[Figure: simple bar chart, Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart, Postoperative Complications. Series: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart, Postoperative Complications. Series: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
[Figure: stacked bar chart, Postoperative Complications. x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea; y-axis: Number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one numerical variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon, Postoperative Complications. Lines: Group I, Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
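The four measures can be checked with Python's standard `statistics` module. This is a minimal sketch using the slide's own example values; the midrange is computed by hand as (1 + 10) / 2 = 5.5, since the module has no function for it.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]   # the example values from the slide
even = [1, 3, 5, 9]       # the even-sized example from the median slide

midrange = (min(data) + max(data)) / 2   # halfway between smallest and largest

print("Mean     =", mean(data))      # sum of values divided by their number
print("Median   =", median(data))    # middle value of the sorted data
print("Mode     =", mode(data))      # most frequent value
print("Midrange =", midrange)

# With an even number of values the median is the average of the two middle ones.
print("Median of even-sized example =", median(even))
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.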
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
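As a rough illustration, the quartiles and percentiles of the 11-value example can be computed with `statistics.quantiles`. Several conventions for interpolating percentiles exist and they can give slightly different answers; the `"inclusive"` method used here is only one of them and may not be the one used in the lecture.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]   # the slide's 11 values, already sorted

# Cut points dividing the data into 4 equal parts: Q1, median, Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Cut points dividing the data into 10 equal parts: 10th, 20th, ..., 90th percentile.
deciles = quantiles(data, n=10, method="inclusive")
p70 = deciles[6]   # 70th percentile
p90 = deciles[8]   # 90th percentile

print("1st quartile    =", q1)
print("Median          =", q2)
print("3rd quartile    =", q3)
print("70th percentile =", p70)
print("90th percentile =", p90)
```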
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
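These values can be reproduced with the standard library. Note that the variance of 11.2 corresponds to dividing the sum of squared deviations (56) by n = 5; dividing by n − 1 = 4, the usual choice when the data are a sample, gives 14 instead. The quartile method below is an assumption of this sketch, chosen because it reproduces the inter-quartile range of 4.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]   # the example values from the slide

data_range = max(data) - min(data)

q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = pvariance(data)   # sum of squared deviations divided by n
pop_sd = pstdev(data)            # square root of the variance above
sample_variance = variance(data) # same sum divided by n - 1

print("Range               =", data_range)
print("Inter-quartile range =", iqr)
print("Variance (n)        =", pop_variance)
print("Standard deviation  =", round(pop_sd, 2))
print("Variance (n - 1)    =", sample_variance)
```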
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
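A minimal sketch of both ideas for a sample mean follows. The heart-rate values are invented for illustration. The interval uses the normal approximation (multiplier of about 1.96) to stay within the standard library; for a sample this small, a multiplier from the t distribution would normally be preferred and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of heart rates (beats/min), invented for illustration.
sample = [72, 68, 75, 70, 66, 74, 71, 69, 73, 72]

n = len(sample)
sample_mean = mean(sample)

# Standard error of the mean: the sample SD divided by the square root of n.
standard_error = stdev(sample) / sqrt(n)

# Multiplier for a two-sided 95% interval under the normal approximation.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Mean = {sample_mean}, SE = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```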
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
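One way to test the "no difference" null hypothesis for two independent samples is a permutation test, sketched below. This is not necessarily the test used in the lecture, which does not name one at this point. Both samples are invented for illustration, and the seed and number of permutations are arbitrary choices.

```python
import random
from statistics import mean

# Hypothetical heart rates (beats/min); both samples are invented for illustration.
inderal = [68, 72, 70, 75, 71, 69, 74, 73]
new_drug = [64, 66, 70, 65, 68, 63, 67, 69]

# Observed difference between the two sample means.
observed = mean(inderal) - mean(new_drug)

# If the null hypothesis is true, the group labels are interchangeable, so we
# reshuffle them many times and see how often a difference at least as
# extreme as the observed one arises by chance alone.
rng = random.Random(0)            # fixed seed so the run is repeatable
pooled = inderal + new_drug
n1 = len(inderal)
n_permutations = 10_000

extreme = 0
for _ in range(n_permutations):
    rng.shuffle(pooled)
    diff = mean(pooled[:n1]) - mean(pooled[n1:])
    if abs(diff) >= abs(observed):   # two-sided comparison
        extreme += 1

p_value = extreme / n_permutations
print(f"Observed difference = {observed}, p-value = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely occur if the two drugs had the same mean effect, which is taken as evidence against the null hypothesis.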
Inferential statistics
(Informed guess)
Type I & Type II Errors
                            The H0 is:
                            True             False
The decision    Accept      Good decision    Type II error
about H0:       Reject      Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted line y = 2.03x + 8. x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
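The fitted line in the figure can be used for prediction, as in this small sketch. It assumes, from the axis labels, that x is age in years and y is weight in kilograms; the function name is hypothetical.

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line y = 2.03x + 8 shown in the figure."""
    slope = 2.03      # predicted increase in weight (kg) per additional year of age
    intercept = 8.0   # predicted weight (kg) at age 0
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(f"Age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```

The plotted ages run from 0 to 14 years, so predictions outside that range would be extrapolation and should not be trusted.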
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                          The disease (outcome) is:
                          Present    Absent
The test is:   Positive   TP         FP
               Negative   FN         TN
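From the four cells of this table one can compute sensitivity and specificity, the two quantities plotted on an ROC curve. The sketch below uses invented counts, and the function name is hypothetical.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Basic accuracy measures from the four cells of a 2x2 diagnostic table."""
    return {
        # Of those with the disease, the proportion the test correctly flags.
        "sensitivity": tp / (tp + fn),
        # Of those without the disease, the proportion the test correctly clears.
        "specificity": tn / (tn + fp),
        # 1 - specificity: the x-axis of an ROC curve.
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts, invented for illustration.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name} = {value:.2f}")
```

For a continuous test, each possible cut-off value gives a different 2x2 table, and plotting sensitivity against 1 - specificity across all cut-offs traces the ROC curve.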
[Figure: two ROC curves. x-axis: 1 - Specificity, 0.0 to 1.0; y-axis: Sensitivity, 0.0 to 1.0. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 42
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data:
             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
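The four summaries listed above can be sketched in a few lines of Python. The patient data and group labels below are hypothetical, invented only for illustration, and the helper names `frequency_table` and `cross_tabulate` are not from the slides. Cumulative frequency is shown for completeness; it is most meaningful when the categories have a natural order.

```python
from collections import Counter

# Hypothetical observations: one postoperative complication per patient.
complications = ["Pain", "Nausea", "Pain", "Vomiting", "Cough",
                 "Pain", "Nausea", "Vomiting", "Pain", "Nausea"]
# Hypothetical group label for each patient, in the same order.
groups = ["I", "I", "II", "I", "II", "II", "II", "I", "I", "II"]


def frequency_table(values):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, count / total, running / total))
    return rows


def cross_tabulate(row_values, column_values):
    """Count how often each (row, column) pair occurs."""
    return Counter(zip(row_values, column_values))


table = frequency_table(complications)
crosstab = cross_tabulate(complications, groups)

for category, count, relative, cumulative in table:
    print(f"{category:<9}{count:>3}{relative:>8.0%}{cumulative:>8.0%}")
```

For a numerical variable the same table can be built after grouping the values into intervals (for example, age bands).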
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantage:
1. Summarize (Area-relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough]
Alternative, width, spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Line chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
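As a quick check of these four measures, here is a minimal sketch using Python's standard `statistics` module on the slide's data set. The midrange has no library function, so it is computed directly as (minimum + maximum) / 2. The second list (1, 3, 5, 9) is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]  # data set used on the slides

mean = statistics.mean(data)              # sum of values / number of values
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the extremes

# With an even number of values there is no single middle value,
# so the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean and the midrange upward while leaving the median and mode unchanged.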
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; 14 when dividing by n - 1)
Standard deviation = 3.35 (dividing by n; 3.74 when dividing by n - 1)
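The spread measures above can be reproduced with the standard `statistics` module (Python 3.8 or later for `quantiles`). This is a sketch under two assumptions: quartiles are computed with the "inclusive" interpolation rule, which is only one of several conventions in use, and the variance divides by n (population form), which is what yields 11.2 for this data set.

```python
import statistics

data = [1, 1, 3, 5, 10]  # data set used on the slides

value_range = max(data) - min(data)

# Quartiles: different textbooks and packages use different rules;
# "inclusive" treats the data as containing the minimum and maximum.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population forms (divide by n).
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample forms (divide by n - 1), usually preferred when the data
# are a sample used to estimate the population spread.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2))
print(sample_variance, round(sample_sd, 2))
```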
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
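A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate sample is hypothetical. The interval uses the normal approximation (multiplier of about 1.96); for a sample this small a t-distribution multiplier would normally be used and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min).
heart_rates = [72, 75, 70, 74, 73, 68, 77, 71]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)      # divides by n - 1

# Standard error of the mean: the SD of the sample mean itself.
standard_error = sample_sd / math.sqrt(n)

# 97.5th percentile of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}")
print(f"95% CI (normal approximation): {ci_low:.1f} to {ci_high:.1f}")
```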
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
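One way to make this example concrete is an exact permutation test of H0: μ1 = μ2 against the one-sided alternative μ1 > μ2 (the new drug lowers heart rate more), where μ1 and μ2 are the two population means. This is a sketch only: the heart rates are hypothetical, the function name is invented for illustration, and the slides do not say which test the lecturer had in mind (a two-sample t-test would be the more common textbook choice).

```python
from itertools import combinations
from statistics import fmean

# Hypothetical heart rates (beats/min) after treatment.
inderal = [72, 75, 70, 74, 73]
new_drug = [66, 68, 65, 70, 67]


def permutation_p_value(group_a, group_b):
    """One-sided exact permutation test for mean(a) - mean(b).

    Under H0 the group labels are interchangeable, so every way of
    splitting the pooled values into two groups is equally likely.
    The p-value is the share of splits whose difference in means is
    at least as large as the one actually observed.
    """
    pooled = group_a + group_b
    observed = fmean(group_a) - fmean(group_b)
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if fmean(a) - fmean(b) >= observed - 1e-12:
            extreme += 1
    return observed, extreme / total


observed_diff, p_value = permutation_p_value(inderal, new_drug)
print(f"observed difference = {observed_diff:.1f}, p = {p_value:.4f}")
```

With these made-up numbers only 2 of the 252 possible splits are as extreme as the observed one, so H0 would be rejected at the usual 0.05 level.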
Inferential statistics
(Informed guess)
Type I & Type II Errors
                         H0 is true       H0 is false
Decision: accept H0      Good decision    Type II error
Decision: reject H0      Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
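The link between α, β (power = 1 - β) and sample size can be illustrated with the usual normal-approximation formula for comparing two means, n per group = 2 (z(1-α/2) + z(power))² σ² / δ². The function name and the numbers below (a 5 beats/min difference, SD of 10) are assumptions chosen for illustration, not values from the slides.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation
    alpha: accepted probability of a type I error
    power: 1 - beta, the probability of detecting a true difference
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


n_per_group = sample_size_per_group(delta=5, sigma=10)
n_smaller_effect = sample_size_per_group(delta=2.5, sigma=10)
print(n_per_group, n_smaller_effect)
```

Halving the difference to be detected roughly quadruples the required sample size, which is why the expected effect size has to be stated before a study starts.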
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg), 0 to 45, against Age (y), 0 to 14; fitted equation y = 2.03x + 8]
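Prediction with the fitted line y = 2.03x + 8 (weight in kg, age in years) amounts to plugging an age into the equation. A small sketch follows; the function name is illustrative, and the equation should only be trusted inside the age range shown on the plot (0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight from the slide's regression line y = 2.03x + 8."""
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.2f} kg")
```

The slope says that predicted weight rises by about 2 kg per year of age; the intercept (8 kg) is the predicted weight at age 0.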
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
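From the four cells of this table the standard accuracy measures of a dichotomous test follow directly. A short sketch with hypothetical counts (the function name and numbers are for illustration only):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        # Of those with the disease, the share the test detects.
        "sensitivity": tp / (tp + fn),
        # Of those without the disease, the share the test clears.
        "specificity": tn / (tn + fp),
        # Of those testing positive, the share who have the disease.
        "ppv": tp / (tp + fp),
        # Of those testing negative, the share who are disease-free.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for 100 patients.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test, whereas the predictive values also depend on how common the disease is in the group tested.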
[Two ROC curves: Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
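For a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) pair, and the ROC curve joins those pairs. The sketch below uses hypothetical scores and disease labels; the function names are illustrative. Scores that are tied share a single cut-off, which is what produces diagonal segments on the curve.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) for every distinct cut-off.

    A patient is called test-positive when score >= cut-off.
    labels: 1 = disease present, 0 = disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results for six patients.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test.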
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 44
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate
or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        The H0 is:
The decision about H0   True             False
Accept                  Good decision    Type II error
Reject                  Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot: Weight (kg) against Age (y), with fitted line y = 2.03x + 8]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 The disease (outcome) is:
The test is:     Present    Absent
Positive         TP         FP
Negative         FN         TN
[Two ROC curve panels: Sensitivity against 1 - Specificity. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 45
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
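As a rough illustration of the frequency measures listed above, the following Python sketch tabulates frequency, relative frequency and cumulative frequency for one qualitative variable. The complication counts are made-up illustrative values, not data from the lecture, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (illustrative only).
complications = ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 20 + ["Cough"] * 5


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, freq in counts.items():  # order of first appearance
        running += freq
        rows.append((category, freq, freq / total, running))
    return rows


table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>5}")
```

Cumulative frequency is most meaningful when the categories are ordered (ordinal data); for unordered categories such as these it only depends on the order in which the rows happen to be listed.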
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications, with slices for Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8% and 39.2%]
Advantage:
1. Summarize (relative frequency)
2. Show magnitude (area – relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients; categories: Nausea, Vomiting, Pain, Cough]
Alternative
Width
spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications; y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
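The four measures of location can be checked with a short Python sketch using only the standard library. It uses the slide's data set 1, 1, 3, 5, 10 and the "Odd vs Even" example 1, 3, 5, 9; the midrange is taken here with its usual definition, (smallest + largest) / 2.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)             # sum of values / number of values
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between smallest and largest

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean above the median, which is the usual argument for preferring the median when data are skewed.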
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
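A minimal sketch of how quartiles and percentiles might be computed for the 11-value data set on this slide. It assumes the "inclusive" interpolation rule of Python's `statistics.quantiles`; textbooks and software packages use several different conventions, so other tools may report slightly different cut points.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Assumption: "inclusive" interpolation; other rules give other values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# n=100 returns the 99 cut points between the 100 percentile groups.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
p70 = percentiles[69]   # 70th percentile
p90 = percentiles[89]   # 90th percentile

print(q1, q2, q3, p70, p90)
```

The second quartile is the median, and the 25th and 75th percentiles are the first and third quartiles.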
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
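The figures above can be reproduced with the sketch below. Two assumptions are worth stating: the quartiles use the "inclusive" rule of `statistics.quantiles`, and the variance and standard deviation use the population formulas (divide by n), since those are the ones that match 11.2 and 3.35.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" rule (an assumption; conventions differ).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas (divide by n) reproduce the figures on the slide;
# the sample formulas (divide by n - 1) would give 14 and about 3.74.
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

print(value_range, iqr, variance, round(sd, 2))
```

When the five values are treated as a sample from a larger population, `statistics.variance` and `statistics.stdev` (dividing by n - 1) are the more common choice.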
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
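To make the two terms concrete, here is a hedged sketch of the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are invented for illustration, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t-based multiplier would normally be preferred.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); illustrative values only.
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)          # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                 # standard error of the mean

# Normal approximation: about 1.96 standard errors either side of the mean.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```

The sample mean is the statistic, the unknown population mean is the parameter, and the interval expresses how precisely the first estimates the second.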
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate
or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                        The H0 is:
The decision about H0   True             False
Accept                  Good decision    Type II error
Reject                  Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
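The definition of the p-value can be illustrated with an exact permutation test, which is one simple way (not necessarily the one used in the lecture) to get "the probability of an outcome at least as extreme as the one observed, assuming H0 is true". The two sets of heart rates are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) under two drugs; illustrative only.
drug_a = [72, 75, 78, 80, 74]
drug_b = [65, 70, 68, 72, 66]

observed = mean(drug_a) - mean(drug_b)
pooled = drug_a + drug_b
n_a = len(drug_a)

# Under H0 the group labels are interchangeable, so every way of splitting
# the pooled values into two groups of these sizes is equally likely.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n_a):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = mean(group_a) - mean(group_b)
    total += 1
    if abs(diff) >= abs(observed) - 1e-12:   # as extreme or more (two-sided)
        extreme += 1

p_value = extreme / total
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

If the resulting p-value is smaller than the chosen α, the null hypothesis of no difference is rejected; otherwise it is not.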
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot: Weight (kg) against Age (y), with fitted line y = 2.03x + 8]
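Prediction with a fitted line such as y = 2.03x + 8 can be sketched with an ordinary least-squares fit. The points below are synthetic, placed exactly on that line rather than taken from any real measurements, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic points placed exactly on the line y = 2.03x + 8
# (not real measurements), so the fit should recover that line.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept   # predicted weight (kg) at age 7 y

print(f"y = {slope:.2f}x + {intercept:.2f}; weight at 7 y = {predicted:.2f} kg")
```

With real data the points scatter around the line, and the prediction is only reasonable inside the range of ages that was actually observed.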
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 The disease (outcome) is:
The test is:     Present    Absent
Positive         TP         FP
Negative         FN         TN
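From the four cells TP, FP, FN and TN, the sensitivity and specificity plotted on a ROC curve can be computed as in this sketch. The counts are hypothetical and `diagnostic_summary` is an illustrative helper name.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity and specificity from the cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly testing positive
    specificity = tn / (tn + fp)   # disease-free patients correctly testing negative
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)

# One point on a ROC curve is (1 - specificity, sensitivity).
roc_point = (1 - spec, sens)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}, ROC point={roc_point}")
```

For a continuous test, each possible cut-off value gives a different 2x2 table and therefore a different point; joining those points traces the ROC curve.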
[Two ROC curve panels: Sensitivity against 1 - Specificity. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 47
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: "Postoperative Complications"; y-axis: Number of patients (%), 0% to 40%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
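The four measures of location can be checked on the slide's data set with a short sketch. It assumes Python's standard `statistics` module and takes the midrange as (minimum + maximum) / 2; the variable names are illustrative only.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]  # the data set used on the slides

summary = {
    "mean": mean(data),                        # sum of values / number of values
    "median": median(data),                    # middle value of the sorted data
    "mode": mode(data),                        # most frequent value
    "midrange": (min(data) + max(data)) / 2,   # halfway between the extremes
}

# "Odd vs Even": with an even number of values the median is the
# average of the two middle values.
even_median = median([1, 3, 5, 9])

for name, value in summary.items():
    print(f"{name}: {value}")
print("median of 1-3-5-9:", even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.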
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
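As a rough illustration, the quartiles and percentiles of the 11-value data set can be computed with `statistics.quantiles`. The choice of the "inclusive" interpolation method is an assumption: textbooks and software packages use several conventions, and they can give slightly different cut points on small samples.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Deciles: nine cut points; the 7th and 9th are the 70th and 90th percentiles.
deciles = quantiles(data, n=10, method="inclusive")
p70, p90 = deciles[6], deciles[8]

print("1st quartile:", q1)
print("median (2nd quartile):", q2)
print("3rd quartile:", q3)
print("70th percentile:", p70)
print("90th percentile:", p90)
```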
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
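The values on this slide can be reproduced with the sketch below. It assumes the population formulas (dividing by n), which is what gives a variance of 11.2 for this data set; the sample formulas (dividing by n - 1) would give 14 instead. The "inclusive" quartile method is likewise an assumption that happens to match the inter-quartile range of 4.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)                    # largest minus smallest
q1, _, q3 = quantiles(data, n=4, method="inclusive")   # quartile cut points
iqr = q3 - q1                                          # inter-quartile range (Q3 - Q1)
variance = pvariance(data)                             # mean squared deviation from the mean
sd = pstdev(data)                                      # square root of the variance

print("Range:", value_range)
print("Inter-quartile range:", iqr)
print("Variance:", round(variance, 2))
print("Standard deviation:", round(sd, 2))
```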
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
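A minimal sketch of both quantities for a sample mean: the standard error is the sample SD divided by the square root of n, and an approximate 95% CI is the mean plus or minus about 1.96 standard errors. The function name `mean_with_ci` is hypothetical, and the normal approximation is an assumption that suits large samples; for a sample as small as this one a t-distribution multiplier would give a wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)                      # sample SD (n - 1) over sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    return m, se, (m - z * se, m + z * se)


m, se, (lower, upper) = mean_with_ci([1, 1, 3, 5, 10])
print(f"mean = {m}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```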
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the sample data provide strong
enough evidence to reject the null
hypothesis.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
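One way to make the p-value definition concrete is an exact permutation test, sketched here for the β-blocker example. The heart-rate values are invented for illustration, the function name is hypothetical, and the permutation test is swapped in as a simple stand-in; it is not necessarily the test the lecture has in mind. Under H0 (both drugs give the same mean heart rate) the group labels are interchangeable, so the p-value is the fraction of all possible regroupings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical heart rates (beats per minute), for illustration only.
inderal = [72, 75, 70, 74, 73]
new_drug = [65, 68, 66, 64, 67]

p_value = permutation_p_value(inderal, new_drug)
print(f"p-value = {p_value:.4f}")
```

With these made-up numbers the two groups do not overlap at all, so only 2 of the 252 possible regroupings are as extreme as the observed one, and the p-value is small.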
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot with fitted line y = 2.03x + 8; y-axis: Weight (kg), 0 to 45; x-axis: Age (y), 0 to 14]
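The fitted line from the figure, y = 2.03x + 8, can be used for prediction as in this small sketch. The function name is hypothetical, and the line should only be trusted inside the age range shown on the plot (roughly 0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line y = 2.03x + 8."""
    return slope * age_years + intercept


for age in (2, 5, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.2f} kg")
```

The slope means that predicted weight rises by about 2.03 kg for each additional year of age, and the intercept (8 kg) is the predicted weight at age 0.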
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                      Disease present    Disease absent
Test positive         TP                 FP
Test negative         FN                 TN
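The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The sketch below uses invented counts and a hypothetical function name; only the formulas are standard.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures computed from the cells of a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # disease-free patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly disease-free
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```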
[Figure: two ROC curves; y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.]
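For a continuous diagnostic test, an ROC curve is built by sweeping the cut-off and plotting sensitivity against 1 - specificity at each step. The sketch below shows the idea on invented scores; the function names and data are hypothetical, and the area under the curve is obtained with the trapezoid rule.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, one per distinct cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A patient tests positive when score >= cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
auc = area_under_curve(curve)
print("ROC points:", curve)
print(f"AUC = {auc:.3f}")
```

When a diseased and a disease-free patient share the same score, one cut-off step moves the curve both up and to the right at once, which produces the diagonal segments mentioned in the figure note.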
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 50
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
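The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the complication counts are hypothetical and are not taken from the lecture's charts.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations (one recorded complication per patient).
complications = (
    ["Nausea"] * 10 + ["Vomiting"] * 8 + ["Pain"] * 15 + ["Cough"] * 7
)

frequency = Counter(complications)           # count per category
total = sum(frequency.values())
categories = list(frequency)                 # order of first appearance

# Relative frequency: share of the total for each category.
relative = {c: frequency[c] / total for c in categories}

# Cumulative frequency: running total down the list of categories.
cumulative = dict(zip(categories, accumulate(frequency[c] for c in categories)))

for c in categories:
    print(f"{c:<10}{frequency[c]:>5}{relative[c]:>8.1%}{cumulative[c]:>5}")
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data); for unordered categories such as these it depends on the arbitrary listing order.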
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slices: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25). Categories: Nausea, Vomiting, Pain, Cough]
Alternative
Width
Spacing
[Bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 100%). Bars: Group I, Group II. Segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications. Y-axis: Number of patients (%) (0% to 40%). Series: Group I, Group II. Categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
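As a quick check, the four location measures for the data set 1, 1, 3, 5, 10 can be reproduced with Python's standard library. This is a minimal sketch. The midrange is computed here as (minimum + maximum) / 2, and the second data set (1, 3, 5, 9) shows the even-count case for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)               # sum of values / number of values
median = statistics.median(data)           # middle value of the sorted data
mode = statistics.mode(data)               # most frequent value
midrange = (min(data) + max(data)) / 2     # halfway between the extremes

# Even number of values: the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.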
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
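A short sketch of how these cut points could be computed for the eleven values above. Several conventions exist for interpolating percentiles and they can give different answers. The "inclusive" method of Python's statistics.quantiles is used here as an assumption; it is not necessarily the convention used in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Percentiles: 99 cut points; index k - 1 holds the k-th percentile.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
p70 = percentiles[69]
p90 = percentiles[89]

print(q1, q2, q3, p70, p90)
```

The second quartile is the median, and the 25th and 75th percentiles are the first and third quartiles.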
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
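The spread measures for 1, 1, 3, 5, 10 can be checked the same way. The sketch below assumes the "inclusive" quartile convention. It also shows that the variance of 11.2 corresponds to dividing by n (population formula), whereas dividing by n - 1 (sample formula) gives a larger value.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles under the "inclusive" convention; other conventions may differ.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)     # sum of squared deviations / n
pop_sd = statistics.pstdev(data)              # square root of the above
sample_variance = statistics.variance(data)   # sum of squared deviations / (n - 1)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```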
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
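A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical. The interval uses the normal approximation (multiplier of about 1.96); for a sample this small a t-distribution multiplier would normally be preferred, which the standard library does not provide.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)       # divides by n - 1
standard_error = sample_sd / math.sqrt(n)       # SD of the sample mean

# Normal-approximation multiplier for a two-sided 95% interval.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as the sample size grows, so larger samples give narrower intervals around the sample mean.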
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
having Inderal be μ1
Let the mean heart rate of all people
having the other new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0, against whether H0 is true or false:

                H0 is true       H0 is false
Accept H0       Good decision    Type II error
Reject H0       Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
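The p-value definition above can be made concrete with an exact permutation test, sketched below on hypothetical heart rates for two small groups. This is one simple non-parametric way to obtain a p-value, chosen here because it needs only the standard library; it is not necessarily the test the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so every possible split of the pooled values is equally likely.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) under each drug; not lecture data.
inderal = [72, 75, 78, 80, 74]
new_drug = [68, 70, 66, 73, 69]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
indices = range(len(pooled))

extreme = 0   # splits at least as extreme as the observed one
total = 0     # all possible splits into two groups of this size
for chosen in combinations(indices, len(inderal)):
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in indices if i not in chosen]
    diff = mean(group_a) - mean(group_b)
    # Two-sided comparison; small tolerance guards against float rounding.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference = {observed:.1f}, p-value = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if the null hypothesis were true; it is then compared with the chosen α to decide whether to reject H0.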
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot: Weight (kg) against Age (y), with fitted line y = 2.03x + 8. X-axis: 0 to 14 years. Y-axis: 0 to 45 kg]
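Prediction with a fitted line amounts to substituting a new x value into the equation. A small sketch using the coefficients shown on the chart (y = 2.03x + 8, with x the age in years and y the weight in kg); the function name is a hypothetical choice for illustration.

```python
SLOPE_KG_PER_YEAR = 2.03   # from the fitted line on the chart
INTERCEPT_KG = 8.0         # from the fitted line on the chart


def predict_weight_kg(age_years: float) -> float:
    """Predicted weight (kg) for a given age (years) from the fitted line."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (2, 5, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.2f} kg")
```

Such a line should only be trusted within the range of ages that was observed (here roughly 0 to 14 years); extrapolating beyond that range is unreliable.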
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test result against the disease (outcome):

                   Disease present    Disease absent
Test positive      TP                 FP
Test negative      FN                 TN
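The usual summary measures follow directly from the four cells of this table. A minimal sketch with hypothetical counts; the function name and the numbers are assumptions for illustration.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }


# Hypothetical counts: 50 diseased and 100 healthy patients.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a continuous test, each choice of cut-off yields its own 2x2 table; plotting sensitivity against 1 - specificity across all cut-offs gives the ROC curve.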
[Two ROC curves: Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 53
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables… and the Data
              Mrs Brown    Mr Patel    Ms Manda
Age           32           24          20
Gender        Female       Male        Female
Blood type    O            O           A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, i.e. binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I
ASA II
ASA III
ASA IV
ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
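The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the patient observations and the helper name frequency_table are hypothetical and do not come from the slides. Note that cumulative frequency is most meaningful when the categories have a natural order (ordinal data); here the categories are simply sorted alphabetically.

```python
from collections import Counter

# Hypothetical observations (not from the slides): one complication per patient.
complications = ["Nausea", "Pain", "Nausea", "Cough", "Vomiting",
                 "Pain", "Nausea", "Pain", "Vomiting", "Nausea"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for category in sorted(counts):  # alphabetical order, an arbitrary choice
        cumulative += counts[category]
        rows.append((category, counts[category], counts[category] / total, cumulative))
    return rows

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>5}")
```

For a numerical variable, the same table can be built after grouping the values into class intervals.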
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%.]
Advantages:
1. Summarize (area represents relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough.]
Alternative – Width – Spacing
[Bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications, Group I vs Group II. y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications, Group I vs Group II. y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough.]
[Stacked bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stack segments: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative Complications, Group I vs Group II. y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
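As a quick check of the four location measures on the slide's example values, here is a minimal Python sketch using only the standard library. The midrange is taken as (minimum + maximum) / 2, which is the usual definition; the variable names are illustrative.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

print(mean, median, mode, midrange)
```

The mean (4) sits above the median (3) because the single large value 10 pulls it upward, which is the usual argument for reporting the median when data are skewed.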
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
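The percentiles named above can be computed for the 11-value example with Python's standard library. Several interpolation conventions exist and the slides do not say which one is intended; the sketch below assumes the "exclusive" method, which places the p-th percentile at position p × (n + 1) in the sorted data.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the 11 values from the slide

# Assumption: "exclusive" interpolation, i.e. position = p * (n + 1).
quartiles = statistics.quantiles(data, n=4, method="exclusive")
deciles = statistics.quantiles(data, n=10, method="exclusive")

q1, q3 = quartiles[0], quartiles[2]  # 1st and 3rd quartiles
p70, p90 = deciles[6], deciles[8]    # 70th and 90th percentiles

print(q1, q3, p70, p90)
```

With 11 values, the quartile positions (3rd, 6th and 9th observations) fall exactly on data points, so no interpolation is needed for Q1 and Q3 under this convention.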
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
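The spread measures above can be reproduced with a short Python sketch. Two assumptions are needed to match the slide's figures: the variance uses the population formula (dividing by n, not n - 1), and the quartiles use the "inclusive" interpolation method. Other conventions give different values, for example a sample variance of 14.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values from the slides

value_range = max(data) - min(data)

# Assumption: "inclusive" quartiles, which give Q1 = 1 and Q3 = 5 here.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumption: population formulas (divide by n).
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

print(value_range, iqr, variance, round(sd, 2))
```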
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
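A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical, and the interval uses the normal approximation (about ±1.96 standard errors); for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of resting heart rates (beats/min); not from the slides.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)   # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)   # SD of the sample mean

# Assumption: normal approximation for the 95% interval.
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The sample mean is the statistic; the interval is the informed guess about where the population parameter lies.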
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                           The H0 is:
                           True             False
The decision     Accept    Good decision    Type II error
about H0         Reject    Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
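To make the p-value definition concrete, here is a sketch of an exact permutation test on the β-blocker example. The heart-rate data are invented for illustration, and the permutation test is one possible method, not necessarily the one the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible regroupings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) after treatment; not real trial data.
inderal = [70, 72, 68, 74, 71, 69]
new_drug = [64, 66, 63, 68, 65, 67]

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)
    extreme = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count

p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05  # conventional type I error rate, chosen before the study
print(f"p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

A p-value below the chosen α leads to rejecting H0; it is not the probability that H0 is true.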
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line y = 2.03x + 8. x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45.]
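The fitted line on the Prediction slide, y = 2.03x + 8, can be used directly to predict weight from age. The sketch below simply evaluates that line; the function name is illustrative, and predictions are only reasonable within the plotted age range (roughly 0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from the line shown on the slide: y = 2.03x + 8."""
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope says that predicted weight rises by about 2.03 kg per year of age; the intercept (8 kg) is the predicted value at age 0.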
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                          The disease (outcome) is:
                          Present    Absent
The test is:   Positive   TP         FP
               Negative   FN         TN
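From the four cells of the diagnostic table, sensitivity and specificity follow directly; these are the two quantities plotted on an ROC curve (sensitivity against 1 - specificity). The counts below are hypothetical and serve only to illustrate the arithmetic.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects whom the test correctly calls positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of disease-free subjects whom the test correctly calls negative."""
    return tn / (tn + fp)

# Hypothetical counts, not from the slides.
tp, fp, fn, tn = 45, 10, 5, 90

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, 1 - specificity = {1 - spec:.2f}")
```

For a continuous test, repeating this calculation at each possible cut-off value yields the points of the ROC curve.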
[Two ROC curves. y-axis: Sensitivity (0.0 to 1.0); x-axis: 1 - Specificity (0.0 to 1.0). Note on each plot: Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 55
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ₁
Let the mean heart rate of all people
taking the new drug be μ₂
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0     H0 is true        H0 is false
Accept                Good decision     Type II error
Reject                Type I error      Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
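To make the p-value concrete, the sketch below runs an exact permutation test on the β-blocker example, under the null hypothesis that the two population mean heart rates are equal. The heart rates are hypothetical (not data from the lecture), the test is two-sided, and the permutation approach is one simple option chosen for illustration rather than the method used in the lecture.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [70, 74, 68, 72, 76]
new_drug = [64, 66, 62, 69, 65]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
n1 = len(inderal)
count = 0
total = 0
# Under H0 the group labels are exchangeable, so every way of splitting
# the pooled values into two groups of these sizes is equally likely.
for idx in combinations(range(len(pooled)), n1):
    group1 = [pooled[i] for i in idx]
    group2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = mean(group1) - mean(group2)
    if abs(diff) >= abs(observed) - 1e-12:   # as extreme or more (two-sided)
        count += 1
    total += 1

p_value = count / total   # probability of such a result if H0 is true
alpha = 0.05              # accepted probability of a type I error
reject_h0 = p_value < alpha

print(observed, total, p_value, reject_h0)
```

The p-value is compared with α, which is fixed before the study; β and the power (1 - β) depend on the sample size and on the size of the true difference.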
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg) against Age (y); y = 2.03x + 8]
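The fitted line on the plot, y = 2.03x + 8, can be used for prediction, with x as age in years and y as weight in kg. This is a small sketch; the function name is illustrative, and predictions are only reasonable within the age range shown on the plot (about 0 to 14 years).

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line on the slide: y = 2.03x + 8."""
    return 2.03 * age_years + 8


for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope means that predicted weight rises by 2.03 kg for each additional year of age, and the intercept 8 is the predicted weight at age 0.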
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present    Disease absent
Test positive      TP                 FP
Test negative      FN                 TN
TP = true positive, FP = false positive, FN = false negative, TN = true negative
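Sensitivity and specificity, the quantities plotted on a ROC curve, follow directly from the four cells of this table. The counts below are hypothetical (not data from the lecture) and serve only to show the arithmetic.

```python
# Hypothetical counts for a 2 x 2 diagnostic table; not data from the lecture.
tp, fp, fn, tn = 80, 10, 20, 90

sensitivity = tp / (tp + fn)   # proportion of diseased patients who test positive
specificity = tn / (tn + fp)   # proportion of disease-free patients who test negative
false_positive_rate = 1 - specificity   # the x-axis of a ROC curve

print(sensitivity, specificity, round(false_positive_rate, 2))
```

For a continuous test, each possible cut-off value gives a different 2 x 2 table, and therefore one (1 - specificity, sensitivity) point on the ROC curve.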
[ROC curves (two panels): Sensitivity against 1 - Specificity. Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
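The four tabulations listed above can be sketched with the standard library. The patient data below are hypothetical (not from the lecture), and the admission categories are borrowed from the earlier ICU admission example.

```python
from collections import Counter

# Hypothetical ICU admission types and genders for 10 patients; not lecture data.
admissions = ["Medical", "Surgical", "Medical", "Poisoning", "Surgical",
              "Medical", "Medical", "Surgical", "Poisoning", "Medical"]
gender = ["F", "M", "M", "F", "F", "M", "F", "M", "M", "F"]

freq = Counter(admissions)                        # frequency of each category
n = len(admissions)
rel_freq = {k: v / n for k, v in freq.items()}    # relative frequency

# Cumulative frequency, accumulated here in descending order of frequency.
cum_freq = {}
running = 0
for category, count in freq.most_common():
    running += count
    cum_freq[category] = running

# Cross tabulation: counts for each (admission type, gender) pair.
crosstab = Counter(zip(admissions, gender))

print(freq, rel_freq, cum_freq, crosstab, sep="\n")
```

Cumulative frequency is most meaningful for ordinal categories, or for numerical variables once they have been grouped into intervals.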
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications; categories Nausea, Vomiting, Pain, Cough; slice sizes 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications; y-axis: Number of patients; categories: Nausea, Vomiting, Pain, Cough]
Alternative: width, spacing
[Bar chart: Postoperative Complications; y-axis: Number of patients (%); categories: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); bars for Group I and Group II within each category: Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); bars for Group I and Group II within each category: Nausea, Vomiting, Pain, Cough]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; one bar each for Group I and Group II; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications; y-axis: Number of patients (%); lines for Group I and Group II across Nausea, Vomiting, Pain, Cough]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
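The odd versus even rule for the median can be checked on the two data sets from this slide. This is a minimal sketch using the standard library.

```python
import statistics

odd = [1, 1, 3, 5, 10]   # 5 values: the median is the 3rd value
even = [1, 3, 5, 9]      # 4 values: the median is the mean of the 2nd and 3rd

median_odd = statistics.median(odd)     # single middle value
median_even = statistics.median(even)   # average of the two middle values

print(median_odd, median_even)
```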
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Slide 58
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: chart of Postoperative Complications (Nausea, Vomiting, Pain, Cough) for Group I and Group II; y-axis: number of patients (%), 0% to 40%.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
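The four location measures can be checked with a short sketch using Python's standard `statistics` module on the slide's data set 1, 1, 3, 5, 10. The midrange is not a library function; it is computed here as the average of the smallest and largest value, which gives (1 + 10) / 2 = 5.5 for these data.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)            # sum of the values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

print(mean, median, mode, midrange)

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])
print(even_median)
```

The single large value 10 pulls the mean and the midrange upward, while the median and the mode are unaffected by it.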
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
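A minimal sketch of how these percentiles could be computed for the eleven-value data set above. It assumes one particular convention (linear interpolation between sorted values, the `inclusive` method of `statistics.quantiles`); other textbooks and packages use slightly different rules and may give slightly different answers. The helper name `percentile` is hypothetical.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

def percentile(values, p):
    """p-th percentile (1..99) by linear interpolation between sorted values."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[p - 1]  # cuts[0] is the 1st percentile, cuts[98] the 99th

q1 = percentile(data, 25)   # 1st quartile
q3 = percentile(data, 75)   # 3rd quartile
p70 = percentile(data, 70)
p90 = percentile(data, 90)
print(q1, q3, p70, p90)
```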
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
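The spread measures listed for 1, 1, 3, 5, 10 can be reproduced with the sketch below. Two assumptions are worth stating: the variance of 11.2 corresponds to dividing by n (the population formula), and the inter-quartile range of 4 corresponds to the `inclusive` quartile convention. Dividing by n - 1 instead (the sample formula) gives a larger variance.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                # inter-quartile range, Q3 - Q1
variance = statistics.pvariance(data)        # mean squared deviation, divides by n
std_dev = statistics.pstdev(data)            # square root of that variance
sample_variance = statistics.variance(data)  # divides by n - 1 instead

print(value_range, iqr, variance, round(std_dev, 2), sample_variance)
```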
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
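As a rough sketch of these two ideas, the code below estimates a population mean from a sample, computes the standard error of the mean (sample SD divided by the square root of n) and builds a 95% confidence interval. The heart-rate values are invented, the function name is hypothetical, and the interval uses a normal approximation (multiplier of about 1.96); for a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower, upper) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)  # sample SD / sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean, std_error, mean - z * std_error, mean + z * std_error

# Hypothetical heart rates (beats/min), invented for the example.
heart_rates = [68, 72, 75, 70, 74, 71, 69, 73, 76, 72]
mean, se, low, high = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```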
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
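One way to test the null hypothesis that the two population means are equal is sketched below. The slides do not name a specific test at this point; an exact permutation test is used here only because it needs nothing beyond the standard library, whereas a two-sample t-test would be the more usual choice. The heart-rate values and the function name are invented for illustration.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = count = 0
    # Under H0 the group labels are exchangeable, so try every relabelling.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-9:  # at least as extreme as what was observed
            extreme += 1
    return extreme / count

# Hypothetical heart rates (beats/min) in two small samples.
inderal = [72, 75, 70, 74, 73, 76, 71, 74]
new_drug = [66, 68, 65, 69, 67, 70, 64, 68]
p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if the null hypothesis were true, which is taken as evidence against it.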
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
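To make the link between type I error, type II error, power (1 minus the type II error probability) and sample size concrete, here is a hedged sketch using the standard normal-approximation formula for comparing two means: n per group = 2 (z for the two-sided type I error + z for the power)² σ² / Δ². The values chosen for σ (standard deviation) and Δ (smallest difference worth detecting) are assumptions, as is the function name.

```python
import math
from statistics import NormalDist

def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Patients per group to detect a mean difference `delta` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls the type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - P(type II error)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole patient

# Assumed: SD of heart rate 10 beats/min, difference worth detecting 5 beats/min.
n_per_group = sample_size_per_group(sigma=10, delta=5)
print(n_per_group)
```

Asking for more power, a smaller type I error, or a smaller detectable difference all increase the required sample size.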
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of Weight (kg) against Age (y) with the fitted regression line y = 2.03x + 8.]
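Prediction of this kind rests on fitting a straight line to paired observations. The sketch below fits a line by ordinary least squares and uses it to predict weight at a given age. The age and weight values are invented (they are not the data behind the slide's line y = 2.03x + 8), and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired observations: age in years, weight in kg.
ages = [2, 4, 6, 8, 10, 12]
weights = [12.5, 15.5, 20.5, 24.0, 28.5, 32.0]

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept  # predicted weight of a 7-year-old
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 7: {predicted:.1f} kg")
```

Predictions are only trustworthy within the range of ages actually observed.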
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Note on the plots: diagonal segments are produced by ties.]
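The quantities on the ROC axes follow directly from the four cells TP, FP, FN and TN. As a minimal sketch: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP); for a continuous test, each cut-off value gives one such pair, and plotting sensitivity against 1 - specificity over all cut-offs traces the ROC curve. The counts, test values and function names below are invented for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity and specificity from the cells of a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # healthy patients correctly cleared
    }

def roc_point(diseased, healthy, threshold):
    """(1 - specificity, sensitivity) when values >= threshold count as positive."""
    tp = sum(value >= threshold for value in diseased)
    fp = sum(value >= threshold for value in healthy)
    return fp / len(healthy), tp / len(diseased)

# Dichotomous test: hypothetical counts.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=40)
print(metrics)

# Continuous test: hypothetical measurements in diseased and healthy patients.
diseased = [0.9, 0.8, 0.7, 0.4]
healthy = [0.6, 0.3, 0.2, 0.1]
curve = [roc_point(diseased, healthy, t) for t in (0.95, 0.65, 0.5, 0.35, 0.0)]
print(curve)
```

Lowering the cut-off raises sensitivity at the cost of specificity, which is why the curve runs from the bottom-left to the top-right corner.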
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 61
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
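The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the observations and group labels are hypothetical, not data from the lecture, and the variable names are chosen for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one ordinal variable (degree of illness).
observations = ["Mild", "Severe", "Moderate", "Mild", "Mild",
                "Moderate", "Severe", "Mild", "Moderate", "Mild"]
# Hypothetical study group of each patient, in the same order.
groups = ["I", "II", "I", "I", "II", "II", "I", "II", "I", "II"]
order = ["Mild", "Moderate", "Severe"]

counts = Counter(observations)
n = len(observations)

frequency = [counts[level] for level in order]    # how many in each category
relative = [f / n for f in frequency]             # share of the total
cumulative = list(accumulate(frequency))          # running total along the order

for level, f, r, c in zip(order, frequency, relative, cumulative):
    print(f"{level:<10}{f:>4}{r:>8.0%}{c:>4}")

# Cross tabulation: one count per (group, category) pair.
crosstab = Counter(zip(groups, observations))
for group in ("I", "II"):
    print(group, [crosstab[(group, level)] for level in order])
```

Cumulative frequency is only meaningful when the categories have an order, which is why an ordinal variable is used here.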
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slices: 23.5%, 27.5%, 9.8%, 39.2%.]
Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)
Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough.]
Alternative, width, spacing
[Bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable, no gaps between the bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications. Y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = (1 + 10) / 2 = 5.5
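The four measures of location can be checked with Python's standard statistics module. This is a minimal sketch using the two data sets shown on the slides; the midrange is computed here as (minimum + maximum) / 2, which is the usual definition.

```python
import statistics

data = [1, 1, 3, 5, 10]          # the odd-sized data set from the slides

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle value after sorting
mode = statistics.mode(data)      # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values the median is the average of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean and the midrange upwards, while the median and the mode are unaffected.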
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
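Percentiles for the 11-value data set can be computed as in the sketch below. Several conventions for percentiles exist and they give slightly different answers on small samples; the helper function `percentile` is written for this illustration and assumes linear interpolation between the sorted values.

```python
def percentile(values, p):
    """Return the p-th percentile (0 to 100) by linear interpolation.

    Assumed convention: the sorted values sit at equally spaced positions
    from the 0th percentile (minimum) to the 100th percentile (maximum).
    """
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100   # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

q1 = percentile(data, 25)    # 1st quartile
q3 = percentile(data, 75)    # 3rd quartile
p70 = percentile(data, 70)   # 70th percentile
p90 = percentile(data, 90)   # 90th percentile

print(q1, q3, p70, p90)
```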
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (population formula, dividing by n; dividing by n - 1 gives 14)
Standard deviation = 3.35
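The measures of spread for the same data set can be reproduced with the standard library. The sketch assumes the population formulas (division by n), since those match a variance of 11.2; the sample formulas (division by n - 1) are shown for comparison. The quartiles use the "inclusive" method of statistics.quantiles, which is one of several conventions.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles by the "inclusive" method (minimum and maximum count as the
# 0th and 100th percentiles).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

population_variance = statistics.pvariance(data)  # divides by n
population_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)       # divides by n - 1
sample_sd = statistics.stdev(data)

print(data_range, iqr, population_variance, round(population_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

When the data are a sample used to estimate the spread of a larger population, the n - 1 version is the one normally reported.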
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
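As a rough sketch, the standard error of a sample mean and an approximate 95% confidence interval can be computed as follows. The heart-rate values are hypothetical, and the multiplier 1.96 assumes a normal approximation; for a sample this small a t-distribution multiplier would normally be used and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats per minute).
heart_rates = [72, 68, 75, 80, 66, 74, 71, 77, 69, 73]

n = len(heart_rates)
mean_hr = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)   # sample standard deviation (n - 1)
se = sd / math.sqrt(n)               # standard error of the mean

z = 1.96                             # assumed normal-approximation multiplier for 95%
ci_low = mean_hr - z * se
ci_high = mean_hr + z * se

print(f"mean = {mean_hr:.1f}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard deviation describes how much individual patients vary; the standard error describes how much the sample mean would vary from one sample to the next, and it shrinks as the sample size grows.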
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                                 The H0 is:
                                 True             False
The decision       Accept H0     Good decision    Type II error
about H0:          Reject H0     Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
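The definition of the p-value can be made concrete with an exact permutation test, sketched below. This is a swapped-in illustration rather than a test named in the lecture, and the heart rates are hypothetical. Under H0 the drug labels are interchangeable, so the code tries every possible way of relabeling the patients and counts how often the difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats per minute) in two independent groups.
inderal = [72, 75, 78, 80, 74]
new_drug = [65, 68, 70, 66, 71]

def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    # Every way of choosing which patients carry the first group's label.
    for chosen in combinations(range(len(pooled)), len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05                      # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here only 2 of the 252 possible relabelings are as extreme as the observed split, so the p-value is small and H0 would be rejected at α = 0.05.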
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted regression line y = 2.03x + 8. X-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45.]
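Prediction with the fitted line amounts to plugging an age into the equation from the chart. The sketch below uses the slope and intercept shown there (y = 2.03x + 8); the function name is chosen for this illustration.

```python
SLOPE = 2.03      # kg gained per year of age, from the fitted line on the chart
INTERCEPT = 8.0   # predicted weight in kg at age 0

def predict_weight_kg(age_years):
    """Predict weight (kg) from age (years) using y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT

for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```

The line should only be used within the age range covered by the data (roughly 0 to 14 years on the chart); predictions outside that range are extrapolation.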
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                           The disease (outcome) is:
                           Present    Absent
The test is:   Positive    TP         FP
               Negative    FN         TN
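Sensitivity and specificity, the two quantities plotted on a ROC curve, follow directly from the four cells of this table. The sketch below uses hypothetical counts, and the function name is chosen for this illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summarize a 2x2 diagnostic table.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    sensitivity = tp / (tp + fn)   # diseased patients correctly detected
    specificity = tn / (tn + fp)   # healthy patients correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,   # x-axis of a ROC curve
    }

# Hypothetical counts for 150 patients.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For a continuous test, each possible cut-off value produces its own 2x2 table, and plotting sensitivity against 1 - specificity over all cut-offs gives the ROC curve.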
[Two ROC curve plots. X-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Diagonal segments are produced by ties.]
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 64
Statistics
When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine
Ain Shams University
Objectives
Define statistics and understand its terminology
Discuss the importance of and need for statistics in the medical field
Distinguish types of data and variables
Describe types of statistics and statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
(Diagram: collecting data, describing data and analyzing data lead to a good decision.)
Why do we need statistics?
Variability (atropine)
Causes: uncontrollable (too many) factors, immeasurable factors, unknown factors
Effect of variability:
Large amount of data describing the same thing (many values for one variable)
No certainty (“deterministic vs probabilistic”)
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable.
The variables (column headings) and the data (values):
Name         Age   Gender   Blood type
Mrs Brown    32    Female   O
Mr Patel     24    Male     O
Ms Manda     20    Female   A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
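One reading of "changing data scales" is that a variable measured on a richer scale can be collapsed onto a simpler one (continuous to ordinal, ordinal to dichotomous), but not the other way round. The sketch below illustrates this with the ages from the Variables & Data example. The cut-offs and the function names are assumptions made up for the illustration; they are not taken from the slides.

```python
# Hypothetical cut-offs, chosen only for illustration.
def age_group(age_years: float) -> str:
    """Map a continuous age onto an ordinal scale (child < adult < elderly)."""
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "adult"
    return "elderly"


def is_adult(age_years: float) -> bool:
    """Collapse the same variable further into a dichotomous one."""
    return age_years >= 18


ages = [32, 24, 20]  # ages of Mrs Brown, Mr Patel and Ms Manda
groups = [age_group(a) for a in ages]
print(groups)
```

Information is lost at each step: the age group can be derived from the age, but the age cannot be recovered from the group.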
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
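The four summaries listed above can be sketched in a few lines. The observations and group labels below are hypothetical and exist only to make the example run; they are not the data behind the charts in this lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: one postoperative complication per patient.
complications = ["nausea", "pain", "vomiting", "pain", "nausea",
                 "cough", "pain", "nausea", "pain", "vomiting"]
# Hypothetical study group of each patient, used for the cross tabulation.
groups = ["I", "I", "I", "I", "I", "II", "II", "II", "II", "II"]

# Frequency: how many times each category occurs.
frequency = Counter(complications)

# Relative frequency: each count divided by the total number of observations.
total = sum(frequency.values())
relative = {category: count / total for category, count in frequency.items()}

# Cumulative frequency: running total over the categories in a fixed order.
# Alphabetical order is used here only to make the result reproducible;
# the running total is most meaningful when the categories are ordered (ordinal).
ordered = sorted(frequency)
cumulative = dict(zip(ordered, accumulate(frequency[c] for c in ordered)))

# Cross tabulation: counts for every (group, complication) combination.
cross_tab = Counter(zip(groups, complications))

print(frequency, relative, cumulative, cross_tab, sep="\n")
```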
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Pie chart: Postoperative complications (nausea, vomiting, pain, cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Bar chart: Postoperative complications (nausea, vomiting, pain, cough); y-axis: number of patients]
[Bar chart: Postoperative complications (nausea, vomiting, pain, cough); y-axis: number of patients (%)]
Notes on the slide: alternative, width, spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Clustered bar chart: Postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
[Stacked bar chart: Postoperative complications; one bar each for Group I and Group II, divided into nausea, vomiting, pain and cough; y-axis: number of patients (%), 0% to 100%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Chart: Postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
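The four measures of location can be checked on the example data set with Python's standard `statistics` module. The midrange has no library function, so it is computed here from its usual definition, (minimum + maximum) / 2, which for this data set is (1 + 10) / 2 = 5.5. The second data set is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)               # sum of the values / number of values
median = statistics.median(data)           # middle value of the sorted data (odd n)
mode = statistics.mode(data)               # most frequent value
midrange = (min(data) + max(data)) / 2     # halfway between the two extremes

# With an even number of values the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean and the midrange are pulled towards the extreme value 10, while the median and the mode are not.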
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
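Several conventions exist for locating a percentile, and the slides do not say which one is intended. The sketch below assumes one common textbook rule: the p-th percentile sits at position p/100 × (n + 1) in the sorted data, with linear interpolation between neighbours. This is what `statistics.quantiles` does with `method="exclusive"`; other software may give slightly different values.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # n = 11, already sorted

# Cut points that divide the data into 4 equal parts (quartiles).
quartiles = statistics.quantiles(data, n=4, method="exclusive")
q1, q2, q3 = quartiles

# Cut points that divide the data into 10 equal parts (deciles);
# the 7th and 9th cut points are the 70th and 90th percentiles.
deciles = statistics.quantiles(data, n=10, method="exclusive")
p70 = deciles[6]
p90 = deciles[8]

print(q1, q2, q3, p70, p90)
```

With 11 values the quartile positions are 3, 6 and 9, so under this rule the quartiles fall exactly on observed values.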
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
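These figures can be reproduced as follows. Two assumptions are needed to match them: the variance is the population form (sum of squared deviations divided by n, not n − 1), and the quartiles use the "inclusive" interpolation rule of `statistics.quantiles`. Other quartile conventions give a different inter-quartile range for such a small data set.

```python
import math
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the inclusive rule (an assumption, see above).
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance: squared deviations from the mean, divided by n.
pop_variance = statistics.pvariance(data)
pop_sd = math.sqrt(pop_variance)

# Sample variance divides by n - 1 instead and is therefore larger.
sample_variance = statistics.variance(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are a sample used to estimate a population, the n − 1 form (14 here) is the usual choice.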
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
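As a rough sketch of how the two quantities are obtained for a sample mean: the standard error is the sample standard deviation divided by √n, and an approximate 95% confidence interval is the mean ± 1.96 standard errors. The heart rates below are hypothetical, and the normal multiplier is a simplifying assumption; for a sample this small a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min).
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 73]

n = len(heart_rates)
mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)      # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                  # standard error of the mean

# Multiplier for a two-sided 95% interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * se
ci_high = mean + z * se
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```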
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept H0                Good decision    Type II error
Reject H0                Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
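The definition of the p-value can be made concrete with a permutation test, used here because it needs no distributional formula; it is a stand-in for illustration, not necessarily the test the lecture has in mind. The heart rates are hypothetical. If the null hypothesis is true (both drugs give the same mean heart rate), the drug labels are interchangeable, so every way of splitting the 12 values into two groups of 6 is equally likely.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min) after treatment.
inderal = [72, 75, 70, 78, 74, 76]
new_drug = [66, 70, 64, 69, 71, 65]

# H0: mean heart rate is the same with both drugs.
# H1: mean heart rate is lower with the new drug (one-sided).
pooled = inderal + new_drug
group_size = len(inderal)
observed_sum = sum(inderal)

count_all = 0
count_extreme = 0
for indices in combinations(range(len(pooled)), group_size):
    relabelled_sum = sum(pooled[i] for i in indices)
    count_all += 1
    # Group sizes are fixed, so comparing group sums is equivalent to
    # comparing the difference between the two group means.
    if relabelled_sum >= observed_sum:
        count_extreme += 1

# Proportion of relabellings at least as extreme as the observed one.
p_value = count_extreme / count_all
print(count_all, count_extreme, p_value)
```

H0 is rejected when the p-value falls below the chosen α (conventionally 0.05).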
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: weight (kg) against age (y); fitted line y = 2.03x + 8]
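Prediction with the fitted line y = 2.03x + 8 amounts to plugging an age into the equation. The sketch below does only that; the function name is made up for the illustration, and the underlying data points are not available in the transcript.

```python
SLOPE = 2.03      # kg gained per year of age, from the fitted line
INTERCEPT = 8.0   # predicted weight (kg) at age 0, from the fitted line


def predict_weight_kg(age_years: float) -> float:
    """Predicted weight for a given age using y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT


for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The line describes the age range shown on the plot (0 to 14 years); predictions outside that range are extrapolations and should not be trusted.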
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN
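The usual summary measures of a dichotomous diagnostic test follow directly from the four cells TP, FP, FN and TN. The counts used below are hypothetical, and the function name is an illustration only.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard measures derived from a 2 x 2 diagnostic table."""
    return {
        # Proportion of diseased patients the test detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free patients the test clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "ppv": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for 100 patients.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.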
[Two ROC curves: sensitivity (y-axis) against 1 - specificity (x-axis), both from 0.0 to 1.0. Diagonal segments are produced by ties.]
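For a continuous test, a ROC curve is built by taking each observed value in turn as the cut-off, classifying everything at or above it as positive, and plotting the resulting sensitivity against 1 − specificity. The sketch below does this for hypothetical test values and computes the area under the curve with the trapezoidal rule; the function names are illustrative.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test values and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(auc)
```

Because tied values share one cut-off, a diseased and a disease-free patient with the same value move the curve up and right in a single step, which is where diagonal segments come from.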
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
45
y = 2.03x + 8
40
Weight (kg)
35
30
25
20
15
10
5
0
0
2
4
6
8
Age (y)
10
12
14
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:
The test Positive
is:
Negative
Present
Absent
TP
FP
FN
TN
ROC Curve
1.0
1.0
0.8
0.8
0.6
0.6
Sensitivity
Sensitivity
ROC Curve
0.4
0.2
0.4
0.2
0.0
0.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
0.0
0.2
0.4
0.6
0.8
1 - Specificity
Diagonal segments are produced by ties.
1.0
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 67
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance of and need for
statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.
Collecting data → Describing data → Analyzing data → Good decision
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The variables and the data
              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size,
putting it into a table,
presenting it in an appropriate chart, or
summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
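The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the ASA classes below are hypothetical data, and the helper name `frequency_table` is an assumption, not something from the lecture.

```python
from collections import Counter

# Hypothetical ASA physical-status classes for 10 patients (ordinal data).
asa = ["I", "II", "II", "I", "III", "II", "I", "IV", "II", "III"]
order = ["I", "II", "III", "IV"]  # categories in their natural order


def frequency_table(values, categories):
    """Return rows of (category, frequency, relative %, cumulative %)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for cat in categories:
        running += counts[cat]  # cumulative count up to this category
        rows.append((cat, counts[cat],
                     100 * counts[cat] / total,
                     100 * running / total))
    return rows


table = frequency_table(asa, order)
for cat, freq, rel, cum in table:
    print(f"ASA {cat:<3} n={freq}  {rel:5.1f}%  cumulative {cum:5.1f}%")
```

Cumulative frequency is only meaningful when the categories have an order, which is why an ordinal variable is used here.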
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart of postoperative complications (nausea, vomiting, pain, cough) with slices of 23.5%, 27.5%, 9.8% and 39.2%]
Advantages:
1. Summarizes the data (slice area shows relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation ("separate pies")
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar charts of postoperative complications (nausea, vomiting, pain, cough); y-axis: number of patients, shown as counts (0 to 25) and, as an alternative, as percentages (0% to 45%)]
Notes: bar width, spacing
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
[Figure: stacked bar chart of the same data, one bar per group (Group I, Group II) scaled to 100% and divided into cough, pain, vomiting and nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
One variable; no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
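The odd/even distinction can be made concrete with a short sketch. The function below is an illustrative implementation, not taken from the lecture: with an odd number of values the median is the middle value, and with an even number it is the average of the two middle values.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                      # odd n: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair


print(median([1, 1, 3, 5, 10]))  # odd n: the middle value, 3
print(median([1, 3, 5, 9]))      # even n: (3 + 5) / 2 = 4.0
```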
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5 (i.e. (1 + 10) / 2)
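As a check on the four measures of location, here is a minimal sketch using Python's standard `statistics` module. It assumes the usual definition of the midrange as (minimum + maximum) / 2, which gives 5.5 for this data set.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

print(f"mean={mean}, median={median}, mode={mode}, midrange={midrange}")
```

Note how the single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected.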
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
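Quartiles and percentiles depend on the interpolation convention, and the slides do not say which one is intended. The sketch below therefore shows both conventions offered by `statistics.quantiles` (Python 3.8+) on the 11-value data set; the helper name `percentile` is an assumption made for illustration.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]


def percentile(values, p, method):
    """p-th percentile (1..99) using the given statistics.quantiles method."""
    return statistics.quantiles(values, n=100, method=method)[p - 1]


for method in ("inclusive", "exclusive"):
    q1, q2, q3 = statistics.quantiles(data, n=4, method=method)
    p70 = percentile(data, 70, method)
    p90 = percentile(data, 90, method)
    print(f"{method}: Q1={q1}, median={q2}, Q3={q3}, P70={p70}, P90={p90}")
```

Both conventions agree on the median (9) but differ on the other cut points, so a report should state which method was used.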
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
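The measures of spread can be reproduced with the standard library. This is a sketch under two assumptions: the quartiles use the "inclusive" convention of `statistics.quantiles`, and the variance uses the population formula (dividing by n), which is the one that matches the values 11.2 and 3.35 quoted above. The sample formula (dividing by n - 1) is shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)                        # largest minus smallest
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                              # spread of the middle 50%

pop_variance = statistics.pvariance(data)   # sum of squared deviations / n
pop_sd = statistics.pstdev(data)            # square root of the above
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)

print(f"range={value_range}, IQR={iqr}")
print(f"population: variance={pop_variance}, SD={pop_sd:.2f}")
print(f"sample:     variance={sample_variance}, SD={sample_sd:.2f}")
```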
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
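A minimal sketch of how the standard error and a 95% confidence interval for a mean might be computed. The heart-rate values are hypothetical, and the interval uses the normal approximation (z of about 1.96), which is an assumption suited to large samples; for a small sample like this one a t-distribution quantile would normally be used and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical heart rates (beats/min) from a sample of 12 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 73, 70, 78]


def mean_ci(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)      # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


mean, se, (lower, upper) = mean_ci(heart_rates)
print(f"mean={mean:.1f}, SE={se:.2f}, 95% CI=({lower:.1f}, {upper:.1f})")
```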
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the sample data provide strong
enough evidence to reject the null
hypothesis.
If the evidence against the null hypothesis is strong
enough for us to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart
rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Then H0: μ1 = μ2 (the two drugs do not differ in mean heart rate)
Inferential statistics
(Informed guess)
Type I & Type II Errors
                            H0 is true       H0 is false
Decision: accept H0         Good decision    Type II error
Decision: reject H0         Type I error     Good decision
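A small simulation can illustrate what a type I error is. In this sketch, which is entirely hypothetical and not from the lecture, every sample is drawn from a population where H0 is true (mean 0, known SD 1), so each rejection is by construction a type I error. With a 5% significance threshold, roughly 5% of the experiments should reject H0.

```python
import math
import random
import statistics


def type_i_error_rate(n_trials=2000, n=30, alpha=0.05, seed=1):
    """Fraction of experiments that wrongly reject a true H0 (mean = 0)."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cut-off
    rejections = 0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]     # H0 is true here
        z = statistics.fmean(sample) * math.sqrt(n)          # z-test, sigma = 1 known
        if abs(z) > z_crit:
            rejections += 1                                  # a type I error
    return rejections / n_trials


rate = type_i_error_rate()
print(f"observed type I error rate: {rate:.3f}")
```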
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
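The definition of the p-value can be turned directly into a computation with an exact permutation test. This is a sketch with hypothetical heart-rate data, and it is a swapped-in technique chosen because it needs no distribution tables; the lecture does not name a specific test here. Under H0 the group labels are interchangeable, so the p-value is the fraction of all possible relabelings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(fmean(group_a) - fmean(group_b))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed difference.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical heart rates (beats/min) after each drug.
inderal = [72, 75, 78, 74, 76]
new_drug = [64, 66, 63, 68, 65]

p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

Here the two groups do not overlap at all, so only 2 of the 252 possible relabelings are as extreme as the data, giving p = 2/252.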
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg) against age (y) with fitted regression line y = 2.03x + 8]
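A prediction line such as y = 2.03x + 8 comes from a least-squares fit. The sketch below is an illustration only: the ages are hypothetical and the weights are placed exactly on that line (an assumption, since the original data points are not available), so the fit should simply recover the slope and intercept.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ages (years); weights (kg) generated from y = 2.03x + 8.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted = slope * 5 + intercept  # predicted weight of a 5-year-old
print(f"y = {slope:.2f}x + {intercept:.2f}; predicted weight at age 5: {predicted:.2f} kg")
```

With real data the points scatter around the line, and the same function returns the line that minimizes the sum of squared vertical distances.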
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                      Disease present        Disease absent
Test positive         TP (true positive)     FP (false positive)
Test negative         FN (false negative)    TN (true negative)
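The four cells of the table give the usual accuracy measures of a dichotomous diagnostic test. The counts in this sketch are hypothetical and the function name is an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the cells of a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # healthy patients correctly cleared
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


# Hypothetical counts: 50 diseased and 100 healthy patients.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, while the predictive values also depend on how common the disease is in the group tested.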
[Figure: two ROC curves; y-axis: sensitivity, x-axis: 1 - specificity, both from 0.0 to 1.0. Diagonal segments are produced by ties.]
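For a continuous diagnostic test, an ROC curve is built by sliding the cut-off over every observed value and recording sensitivity against 1 - specificity. The sketch below uses hypothetical scores and labels, and the function names are assumptions. Each distinct score is used once as a threshold, so a diseased and a healthy patient sharing the same score move the curve diagonally, which is the "ties" effect noted on the plot.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points from (0, 0) to (1, 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everyone scoring at or above the threshold is called test-positive.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print(curve)
print(f"AUC = {area:.3f}")
```

An area of 1.0 means the test separates the two groups perfectly, while 0.5 corresponds to the diagonal line of a test no better than chance.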
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 69
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Clustered bar chart: Postoperative Complications by group (Group I, Group II). x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]
[Stacked bar chart: Postoperative Complications. x-axis: Group I, Group II; y-axis: Number of patients (%), 0% to 100%; stacked segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable, no gaps between bars
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Frequency polygon: Postoperative Complications by group (Group I, Group II). x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
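A minimal sketch of the four location measures for the slide's data set, using Python's standard `statistics` module. The midrange is taken here as (minimum + maximum) / 2, and the second data set (1, 3, 5, 9) illustrates the even-count case for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even number of values, the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean (4) is pulled above the median (3) by the single large value 10, which shows how sensitive the mean is to extreme values.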
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th percentile
1-2-5-6-8-9-11-13-17-20-22
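As a sketch, the quartiles and percentiles of the 11-value data set can be computed with `statistics.quantiles` (Python 3.8+). Several conventions exist for interpolating percentiles; the module's default ("exclusive") method is used here as an assumption, and other software may report slightly different values.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second data set on the slide

# n=4 returns the three quartile cut points (Q1, median, Q3).
q1, q2, q3 = statistics.quantiles(data, n=4)

# n=100 returns the 99 percentile cut points; index k-1 is the k-th percentile.
percentiles = statistics.quantiles(data, n=100)
p70, p90 = percentiles[69], percentiles[89]

print(q1, q2, q3, p70, p90)
```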
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
Standard deviation = 3.35 (3.74 when dividing by n − 1)
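The spread measures for the same data set can be sketched as follows. Two choices here are assumptions: quartiles use the "inclusive" method of `statistics.quantiles`, and both the population formulas (dividing by n) and the sample formulas (dividing by n − 1) are shown side by side.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles with the "inclusive" convention; other conventions can differ.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)

print(data_range, iqr, pop_variance, round(pop_sd, 2),
      sample_variance, round(sample_sd, 2))
```

Dividing by n describes the five values as a complete population; dividing by n − 1 is the usual choice when the values are a sample used to estimate a population's variance.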
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
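A minimal sketch of a standard error and a 95% confidence interval for a mean. It assumes the normal approximation (mean ± z × SE); for small samples a t distribution would give a somewhat wider interval. The heart-rate values and the function name `mean_confidence_interval` are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean: mean +/- z * SE."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical heart rates (beats/min), invented for illustration.
heart_rates = [62, 70, 68, 75, 66, 71, 64, 73, 69, 72]
mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```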
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
efficient than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
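Writing μ1 and μ2 for the two population mean heart rates, the comparison could be stated as a pair of hypotheses. The two-sided alternative below is an assumption; a one-sided alternative (μ2 < μ1) would correspond more directly to asking whether the new drug lowers heart rate more.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_1 \neq \mu_2
```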
Inferential statistics
(Informed guess)
Type I & Type II Errors
The decision about H0 | H0 is true    | H0 is false
Accept                | Good decision | Type II error
Reject                | Type I error  | Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
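One way to make the p-value definition concrete is an exact permutation test, sketched below. This is a swapped-in illustrative technique, not necessarily the test the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as extreme as the one observed. The heart rates and the function name are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def scaled_diff(sum_a):
        # |mean_a - mean_b| multiplied by n_a * n_b; exact for integer data.
        return abs(sum_a * n_b - (total - sum_a) * n_a)

    observed = scaled_diff(sum(group_a))
    extreme = count = 0
    # Every way of choosing which pooled values are relabeled as group A.
    for indices in combinations(range(len(pooled)), n_a):
        count += 1
        if scaled_diff(sum(pooled[i] for i in indices)) >= observed:
            extreme += 1
    return extreme / count

# Hypothetical heart rates (beats/min), invented for illustration.
conventional = [70, 72, 74, 76, 78]
new_drug = [60, 62, 64, 66, 68]
p = permutation_p_value(conventional, new_drug)
print(f"p = {p:.4f}")  # prints "p = 0.0079"
```

Here only 2 of the 252 possible relabelings are as extreme as the observed split, so p = 2/252. With the conventional threshold α = 0.05, such a result would lead to rejecting H0.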
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Scatter plot with fitted line: Weight (kg), 0 to 45, against Age (y), 0 to 14; fitted line y = 2.03x + 8]
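Prediction with the fitted line y = 2.03x + 8 amounts to plugging an age into the equation. The sketch below does only that; the function name `predict_weight_kg` is hypothetical, and the line should be trusted only within the plotted age range (0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from the line y = 2.03x + 8 shown on the slide."""
    return slope * age_years + intercept

for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope reads as an average gain of about 2 kg per year of age; the intercept (8 kg) is the predicted weight at age 0.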
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The test is: | The disease (outcome) is: Present | Absent
Positive     | TP                                | FP
Negative     | FN                                | TN
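The four cells of the table lead to the usual summary measures of a dichotomous diagnostic test. A minimal sketch follows; the counts and the function name `diagnostic_summary` are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are truly diseased
        "npv": tn / (tn + fn),          # negative results that are truly disease-free
    }

# Hypothetical counts, invented for illustration.
summary = diagnostic_summary(tp=40, fp=10, fn=20, tn=130)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.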
[Two ROC curves: Sensitivity (0.0 to 1.0) against 1 − Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
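For a continuous diagnostic test, each possible cut-off gives one (1 − specificity, sensitivity) pair, and the ROC curve joins those pairs. The sketch below builds the points and a trapezoidal area under the curve; the scores, disease labels and function names are hypothetical.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]
    # The test is called positive when score >= cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test values and disease status, invented for illustration.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, False, True, False, False]
points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(auc)
```

When a diseased and a disease-free patient share the same score, both rates change at the same cut-off, which is what produces a diagonal segment on the curve.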
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 72
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender (dichotomous, binary): Male, Female
Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness: Mild, Moderate, Severe
Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
Glasgow Coma Scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Examples of populations:
All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, perhaps putting it into a table, maybe presenting it in an appropriate chart, or summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
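The four summaries listed above can be sketched in a few lines of Python. This is a minimal illustration only: the patient observations, the group labels and the helper names `frequency_table` and `cross_tabulate` are all hypothetical and do not come from the lecture.

```python
from collections import Counter

# Hypothetical observations: one postoperative complication per patient.
complications = ["Nausea", "Pain", "Cough", "Nausea", "Vomiting",
                 "Pain", "Pain", "Nausea", "Cough", "Pain"]
# Hypothetical study group of each patient, in the same order.
groups = ["I", "I", "I", "I", "I", "II", "II", "II", "II", "II"]


def frequency_table(values):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category in sorted(counts):
        running += counts[category]
        rows.append((category, counts[category],
                     counts[category] / total, running / total))
    return rows


def cross_tabulate(row_values, column_values):
    """Count how often each (row, column) pair occurs."""
    return Counter(zip(row_values, column_values))


table = frequency_table(complications)
crosstab = cross_tabulate(complications, groups)

for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.0%}{cum:>8.0%}")
for (category, group), n in sorted(crosstab.items()):
    print(f"{category:<10} Group {group:<3}{n}")
```

Cumulative frequency is most informative when the categories have a natural order (ordinal data); here the categories are simply sorted alphabetically for the sake of the example.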
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
[Figure: pie chart, "Postoperative Complications"; categories: Nausea, Vomiting, Pain, Cough; slice labels: 23.5%, 27.5%, 9.8%, 39.2%]
Advantages:
1. Summarizes (area = relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
[Figure: simple bar chart, "Postoperative Complications"; y-axis: number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough]
Alternative
Width
spacing
[Figure: the same bar chart with the y-axis as number of patients (%), 0% to 45%]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
[Figure: clustered bar chart, "Postoperative Complications"; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
[Figure: clustered bar chart, "Postoperative Complications"; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Figure: stacked bar chart, "Postoperative Complications"; y-axis: number of patients (%), 0% to 100%; x-axis: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
[Figure: frequency polygon, "Postoperative Complications"; y-axis: number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Example data: 1 – 1 – 3 – 5 – 10
Mean (Average): Advantages – Disadvantages – Ratio
Median: odd vs even number of values (for 1 – 3 – 5 – 9 the median is 4)
Mode
Midrange
Results for the example data:
Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
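The four measures of location can be checked with Python's standard `statistics` module. This is a small sketch on the example values 1, 1, 3, 5, 10; the midrange is computed here as (minimum + maximum) / 2, which comes to 5.5 for these values.

```python
import statistics

data = [1, 1, 3, 5, 10]           # the example values from the slides

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward while leaving the median and the mode unchanged, which is one reason the median is often preferred for skewed data.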
Descriptive statistics
Numerical variables
Measures of degree of dispersion (summary measures of spread)
Example data: 1 – 1 – 3 – 5 – 10
Range
Percentiles – quartiles: 1st quartile, 3rd quartile, 70th percentile, 90th percentile
Second example for percentiles: 1-2-5-6-8-9-11-13-17-20-22
Inter-quartile range (Q3 – Q1)
Variance – standard deviation
Results for the example data:
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
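The measures of spread for 1, 1, 3, 5, 10 can be reproduced as follows. Two assumptions are worth stating: quartiles depend on the interpolation rule, and the "inclusive" rule used below is simply one that yields an inter-quartile range of 4; the variance of 11.2 corresponds to the population formula (dividing by n), whereas the sample formula (dividing by n - 1) gives 14.

```python
import math
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles depend on the interpolation rule; "inclusive" treats the data
# as the whole population and gives Q1 = 1 and Q3 = 5 for these values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas (divide by n).
variance = statistics.pvariance(data)
std_dev = math.sqrt(variance)

# Sample formula (divide by n - 1), shown for comparison.
sample_variance = statistics.variance(data)

print(value_range, q1, q3, iqr, variance, round(std_dev, 2), sample_variance)
```

Statistical packages usually report the sample variance by default, so it is worth checking which formula a given program uses before comparing results.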
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
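As a rough sketch of how a standard error and a 95% confidence interval for a mean are obtained, the snippet below uses a hypothetical sample of heart rates (not data from the lecture) and the normal approximation with a multiplier of about 1.96. For a sample this small a t-distribution multiplier would normally be preferred, so treat the interval as illustrative only.

```python
import math
import statistics

# Hypothetical sample of resting heart rates (beats per minute).
heart_rates = [72, 75, 70, 74, 73, 69, 71, 76, 74, 72]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)   # sample SD, divides by n - 1

# Standard error of the mean: the SD of the sample mean itself.
standard_error = sample_sd / math.sqrt(n)

# Normal-approximation multiplier for 95% coverage (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}, "
      f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample roughly halves the width of the interval.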
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question. For example, is stress a risk factor for breast cancer?
To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate, or not?
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
Inferential statistics
(Informed guess)
Type I & Type II Errors
Decision about H0   H0 is true      H0 is false
Accept H0           Good decision   Type II error
Reject H0           Type I error    Good decision
Inferential statistics
(Informed guess)
α: probability of committing a type I error
β: probability of committing a type II error
p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
Sample size & power of the study
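The definition of the p-value can be made concrete with an exact permutation test, which is used here only because it needs no distribution tables; it is not necessarily the test the lecture has in mind. The heart-rate values, the 0.05 threshold and the function name `permutation_p_value` are assumptions for illustration. Under the null hypothesis the drug labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats per minute) under each drug.
inderal = [72, 75, 70, 74, 73]
new_drug = [64, 66, 63, 65, 67]

ALPHA = 0.05  # accepted probability of a type I error (an assumption)


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(inderal, new_drug)
reject_h0 = p_value < ALPHA
print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

With five patients per group there are 252 possible relabelings, so the smallest achievable two-sided p-value is 2/252; very small studies cannot produce strong evidence however large the effect, which is the link to sample size and power.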
Inferential statistics
(Informed guess)
How does it work?
Probability distributions
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Prediction
[Figure: plot of weight against age with fitted line y = 2.03x + 8; y-axis: Weight (kg), 0 to 45; x-axis: Age (y), 0 to 14]
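Prediction with a fitted line can be sketched as below. The equation y = 2.03x + 8 (weight in kg against age in years) is taken from the figure; the ages, the generated weights and the helper names `fit_line` and `predict_weight` are hypothetical, and ordinary least squares is assumed as the fitting method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from the line shown in the figure."""
    return slope * age_years + intercept


# Hypothetical ages; the weights are generated from the figure's line,
# so the fit should recover the same slope and intercept.
ages = [2, 4, 6, 8, 10, 12]
weights = [predict_weight(a) for a in ages]
slope, intercept = fit_line(ages, weights)

print(f"fitted: y = {slope:.2f}x + {intercept:.2f}")
print(f"predicted weight at age 10: {predict_weight(10):.1f} kg")
```

Real measurements scatter around the line, and predictions outside the observed age range (extrapolation) should be treated with caution.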
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
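The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The counts below are hypothetical, and the function name `diagnostic_summary` is chosen only for this sketch.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true
        "npv": tn / (tn + fn),          # negative tests that are true
    }


# Hypothetical counts for 100 patients.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)

for name, value in summary.items():
    print(f"{name:<12}{value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.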
[Figure: two ROC curves; y-axis: Sensitivity, 0.0 to 1.0; x-axis: 1 - Specificity, 0.0 to 1.0. Diagonal segments are produced by ties.]
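For a continuous diagnostic test, a ROC curve is built by sliding the positivity threshold across the observed values and plotting sensitivity against 1 - specificity at each step. The scores and disease labels below are hypothetical, and the helper names are assumptions for this sketch; the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) for each rule 'positive if score >= t'."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease)
                 if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, has_disease)
                 if s >= threshold and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test results: higher scores suggest disease.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

When a diseased and a non-diseased patient share the same score, both rates change at the same threshold, which is what produces a diagonal segment instead of a vertical or horizontal step.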
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 74
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker is more
efficient than another conventional βblocker (e.g. Inderal) in decreasing heart
rate or not.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
having Inderal is 1
Let the mean heart rate of all people
having the other new drug is 2
Inferential statistics
(Informed guess)
Type I & Type II Errors
The Ho is:
The
Accept
decision
about
reject
Ho
True
false
Good decision
Type II error
Type I error
Good decision
Inferential statistics
(Informed guess)
: Probability of conducting type I error
: Probability of conducting type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some example of testing of hypothesis?
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: scatter plot of weight (kg) against age (y) with fitted line y = 2.03x + 8]
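The fitted line y = 2.03x + 8 from the weight-against-age chart can be used for prediction. The sketch below is illustrative only: `fit_line` and `predict_weight` are hypothetical names, and the sample points are generated from the slide's line rather than taken from real data. It shows how an ordinary least-squares line is obtained and how it is then used to predict weight from age.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (y), using the line shown on the slide."""
    return slope * age_years + intercept


# Hypothetical points placed exactly on the slide's line, so the fit recovers it.
ages = [2, 4, 6, 8, 10]
weights = [predict_weight(a) for a in ages]

slope, intercept = fit_line(ages, weights)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted weight at age 5: {predict_weight(5):.2f} kg")
```

Predictions are only reasonable inside the age range covered by the chart (0 to 14 years); extrapolating the line beyond it is not supported by the data.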
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
The disease (outcome) is:   Present               Absent
Test positive               TP (true positive)    FP (false positive)
Test negative               FN (false negative)   TN (true negative)
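The four cells of the diagnostic table give the usual summary measures of a dichotomous test. The sketch below is a minimal illustration with invented counts; `diagnostic_summary` is a hypothetical helper, and the predictive values are standard additions that the slide itself does not list.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Summary measures from the 2x2 table of test result against disease status."""
    return {
        # Proportion of diseased patients the test detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free patients the test clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "ppv": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts; not data from the lecture.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.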
ROC Curve
[Figure: two ROC curves, each plotting sensitivity against 1 - specificity on axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
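For a continuous diagnostic test, an ROC curve is built by sliding the positivity threshold across the observed values and recording sensitivity and 1 - specificity at each step. The sketch below is illustrative: the scores and labels are invented, and `roc_points` and `auc` are hypothetical helper names. Tied scores are handled in a single threshold step, which is what produces the diagonal segments mentioned in the chart footnote.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, from strictest to loosest threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A result counts as test-positive when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical test values and disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(f"AUC = {auc(points):.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to a test that separates diseased from disease-free patients perfectly.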
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you
Slide 77
Statistics
When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University
Objectives
Define statistics and understand its
terminology
Discuss the importance and need of
statistics in medical field
Distinguish types of data and variables
Describe types of statistics & statistical
tests
What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.
Collecting
data
Describing
data
Good
Decision
Analyzing
data
Why do we need statistics?
Variability (atropine)
Causes: Uncontrollable (too many) factors
Immeasurable factors
Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability
Large amount of data describing the same
thing (many values for one variable)
No certainty “Deterministic vs probabilistic”
Sampling
Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)
Variables & Data
A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
Data are the values you get when you
measure a variable
The Variables……
………and the Data
Mrs
Brown
Age
Gender
Blood
type
Mr
Patel
Ms
Manda
32
24
20
Female
Male
Female
O
O
A
Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
Gender: (dichotomous –binary)
Male
Female
Type of ICU admission
Medical
Surgical
Physical injuries
Poisoning
Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
Degree of illness
Mild
Moderate
Severe
Physical status
ASA
ASA
ASA
ASA
ASA
I
II
III
IV
V
Glasgow’s coma scale
Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
Episodes of myocardial ischemia
b) Continuous (interval – ratio) variable
Cardiac index
Creatinine clearance
Types of variables
Changing data scales
Sample and Population
A sample is a group (subset) taken
from a population.
Population is the group of ALL
individuals (entities) sharing specific
characteristics
Sample and Population
All human beings
Day case surgical patients undergoing
general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal
anesthesia
Human Skeletal muscle fibers
Cardiac muscles of rates
Descriptive statistics
Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
Descriptive statistics
Qualitative variables
Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate
chart is almost always a good idea.
What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.
Descriptive statistics
Qualitative variables
Charts
Pie chart
Postoperative Complications
23.5%
27.5%
9.8%
Nausea
Vomiting
Pain
Couph
39.2%
Disadvantage:Advantage:-
one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Postoperative Complications
Number of patients
25
20
15
10
5
0
Nausea
Vomiting
Pain
Alternative
Width
spacing
Couph
Postoperative Complications
Number of patients (%)
45%
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Postoperative Complications
Number of patients (%)
45%
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Postoperative Complications
Number of patients (%)
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
Group I
Group II
Couph
Pain
Vomiting
Nausea
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
one variable – no gapping
Descriptive statistics
Qualitative variables
Charts
Pie Chart
Bar Chart
Simple
Clustered bar chart
Stacked bar chart
Histogram
Frequency polygon
Cumulative frequency polygon (Ogive)
Number of patients (%)
Postoperative Complications
Group I
Group II
40%
35%
30%
25%
20%
15%
10%
5%
0%
Nausea
Vomiting
Pain
Couph
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mean (Average)
1 – 1 – 3 – 5 - 10
Advantages – Disadvantages – Ratio
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Median
1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Mode
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Midrange
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
1 – 1 – 3 – 5 - 10
Mean = 4
Median = 3
Mode = 1
Mid range = 4.5
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Range
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Percentiles – quartiles
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
90th percentile
70th quartile
1-2-5-6-8-9-11-13-17-20-22
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Inter-quartile range (Q3 – Q1)
1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
Variance – standard deviation
1 – 1 – 3 – 5 - 10
Descriptive statistics
Numerical variables
Measures of Central tendency
(summary measures of location)
Measures of degree of dispersion
(summary measures of spread)
1 – 1 – 3 – 5 - 10
Range = 9
Inter-quartile range = 4
Variance = 11.2
Standard deviation = 3.35
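The measures of spread listed above can be reproduced with a short Python sketch. Two assumptions are needed to match the slide's figures: the quartiles are computed with the "inclusive" convention, and the variance divides by n (the population formula) rather than by n - 1.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]  # the sample used on the slides

data_range = max(data) - min(data)             # largest minus smallest value

# Quartiles under the "inclusive" convention (an assumption, see above).
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                  # inter-quartile range, Q3 - Q1

population_variance = pvariance(data)          # mean squared deviation, divides by n
population_sd = pstdev(data)                   # square root of the variance

sample_variance = variance(data)               # divides by n - 1 instead

print(data_range, iqr, population_variance, round(population_sd, 2), sample_variance)
```

Dividing by n gives the slide's variance of 11.2 and standard deviation of about 3.35. Dividing by n - 1, which is the usual choice when the data are a sample from a larger population, gives a variance of 14.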
Descriptive statistics
Numerical variables
Inferential statistics
(Informed guess)
Making inference about population
parameters from sample statistics
Standard Error (SD of the statistic)
Confidence interval (95% CI)
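As a rough illustration of these two ideas, the sketch below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values are invented for the example, the function name is hypothetical, and the interval uses the normal approximation (multiplier of about 1.96). For a sample this small a t-based multiplier would usually be preferred and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = stdev(sample) / sqrt(n)          # SD of the sample mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    margin = z * standard_error
    return sample_mean, standard_error, (sample_mean - margin, sample_mean + margin)

# Hypothetical heart rates (beats/min), invented for illustration only.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
m, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {m:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```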
Inferential statistics
(Informed guess)
Hypothesis testing
Almost all clinical research begins with a question.
For example, is stress a risk factor for breast
cancer?
To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
This usually takes the following form:
H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate
Inferential statistics
(Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) at decreasing heart
rate?
Inferential statistics
(Informed guess)
Hypothesis testing
Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
Inferential statistics
(Informed guess)
Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision
Inferential statistics
(Informed guess)
α: Probability of committing a type I error
β: Probability of committing a type II error
p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
Sample size & power of the study
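The p-value definition above can be made concrete with a permutation test, sketched here in Python. This is one possible way to obtain a p-value and is not a method named on the slides. The heart rates are invented, and the function name and parameters are hypothetical. Under H0 (no difference between the drugs) the drug labels are interchangeable, so the test shuffles them many times and counts how often the shuffled difference in means is at least as extreme as the observed one.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)              # fixed seed so the result is reproducible
    n_a = len(group_a)

    def abs_mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    pooled = list(group_a) + list(group_b)
    observed = abs_mean_diff(pooled)       # the outcome actually observed

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                # reassign the drug labels at random
        if abs_mean_diff(pooled) >= observed:
            extreme += 1                   # as extreme as, or more extreme than, observed

    # The +1 counts the observed arrangement itself, so p is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical heart rates (beats/min), invented for illustration only.
inderal = [72, 75, 70, 78, 74, 71, 76, 73]
new_drug = [65, 68, 63, 70, 66, 64, 69, 67]

p_value = permutation_p_value(inderal, new_drug)
print(f"p-value = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if H0 were true. It is then compared with the chosen α to decide whether to reject H0.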
Inferential statistics
(Informed guess)
How does it work? Probability distributions
Inferential statistics
(Informed guess)
How does it work?
Parametric vs non-parametric tests
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Comparing two or more factors
Association
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
[Figure: weight (kg) plotted against age (y), with fitted line y = 2.03x + 8]
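Prediction with a fitted line such as y = 2.03x + 8 can be sketched as follows. The slope and intercept come from the slide. The data points are not shown in the transcript, so the ones below are hypothetical and are generated to lie exactly on that line. The function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept

# Hypothetical ages (years) with weights placed exactly on the slide's line.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted_weight = predict(5, slope, intercept)   # predicted weight (kg) at age 5
print(round(slope, 2), round(intercept, 2), round(predicted_weight, 2))
```

Real measurements scatter around the line, so the fitted slope and intercept are estimates. Predictions are only trustworthy within the age range that was actually observed.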
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Diagnostic tests
                   Disease present    Disease absent
Test positive      TP                 FP
Test negative      FN                 TN
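The usual summary measures of a dichotomous diagnostic test follow directly from the TP, FP, FN and TN cells. The sketch below uses invented counts and a hypothetical function name. Sensitivity and specificity are the quantities plotted on an ROC curve. The two predictive values are standard additions that the slide itself does not list.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Summary measures of a diagnostic test from the four cells of the 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # disease-free patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly disease-free
    }

# Hypothetical counts, invented for illustration only.
summary = diagnostic_summary(tp=90, fp=20, fn=10, tn=80)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```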
ROC Curve
[Figure: two ROC curves, sensitivity plotted against 1 - specificity. Diagonal segments are produced by ties.]
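For a continuous diagnostic test, an ROC curve is built by trying every possible cut-off and recording the resulting sensitivity and 1 - specificity. A minimal sketch follows, with invented test scores and hypothetical function names. It assumes that higher scores indicate disease and that a result at or above the cut-off counts as positive.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    labels: 1 = disease present, 0 = disease absent.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]                      # cut-off above every score: nobody positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical test results, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test that does no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.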
Inferential statistics
(Informed guess)
Some examples of hypothesis testing
Comparisons
One sample
Two independent samples
Two dependent samples
More than two samples (independent-dependent)
Association
Prediction
Diagnostic test (dichotomous – continuous)
Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
At best, the net effect is to waste time,
effort, and money for the project.
At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.
Thank you