#### Transcript Statistics-MAT 150 Chapter 2 Descriptive Statistics

### Statistics-MAT 150

## Chapter 1 Introduction to Statistics

### Prof. Felix Apfaltrer

### Office:N518 Phone: x7421

# Chapter 1

### • Overview • Nature of data • Skills needed in statistics

### Overview Statistics: • Descriptive

– Analyze nature of data from surveys, experiments, observations,

### • Inferential

– Draw conclusions from the analyses with respect to the population

### Survey: tool to collect data from a smaller group which is part of a larger group to learn something about the larger group

Key goal of statistics: • Learn about a large group (population) from data from a smaller subgroup (sample)

### Overview

**Definitions:**

• Data: observations collected (measurements, gender, answers, …)

• Statistics: collection of methods to analyze data

• Population: complete collection of elements (scores, measurements, subjects, …)

• Sample: subcollection of members selected from the population

• Census: collection of data from every member of the population

## Overview 2

**Example:**

• Poll: 1087 adults are asked whether they drink alcoholic beverages or not.

– Sample: 1087 adults

– Population: US adults (150 million)

• Census: Every 10 years, the Census Bureau tries to collect information from *every* member of the US population.

– **Impossible!**

– **Very expensive!**

*Use sample data to draw conclusions about the whole population: inferential statistics!*

## Types of data

• **Parameter:** a numerical measurement describing some characteristic of the *population.*

– *Lincoln elected: 39.82% of 1,865,908 votes counted.*

– *39.82% is a parameter.*

• **Statistic:** a numerical measurement describing some characteristic of the *sample.*

– *Based on a sample of 877 elected executives, 45% would not hire an applicant with a typographical error in the application.*

– *45% is a statistic.*
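The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of 10,000 people and its 60% share are made-up numbers, not data from the course, and the names `population`, `parameter`, and `statistic` are chosen here for the example.

```python
import random

# Hypothetical population: 10,000 people, 1 = "yes", 0 = "no".
# The 60% share of "yes" is invented purely for illustration.
population = [1] * 6000 + [0] * 4000

# Parameter: computed from EVERY member of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a sample only (here 500 members drawn at random).
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 500)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.3f}")
print(f"statistic = {statistic:.3f}")
```

The parameter is a fixed number, while the statistic changes from sample to sample; changing the seed gives a slightly different statistic each time.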

### Types of data 2

• **Quantitative data:** numbers representing counts or measurements. *Weights of supermodels.*

• **Qualitative data:** nonnumerical. *Gender of an athlete.*

• **Discrete** *vs.* **continuous data:** *number of people in a household* vs. *temperatures in May.*

• **Nominal level** of measurement: names, labels, categories; no ordering. *Yes/No/Undecided responses, colors.*

• **Ordinal level** of measurement: some order, but differences between values are meaningless or nonexistent. *Course grades A, B, C, D, F. “Livability rank of a city”.*

• **Interval level** of measurement: order with meaningful differences, but no natural zero (zero is absent or arbitrary). *Temperature, year.*

• **Ratio level** of measurement: as interval, but with a meaningful zero. *Weights, prices (non-negative).*

### Basic skills

**Samples:**

• Representative: *“39/40 polled people vote for A”. Sampled in A’s headquarters!*

• Not too small: *CDF published “among HS students suspended, 67% suspended more than 3 times”. Sample size: 3!*

**Graphs:** *In which one does red do better?*

[Figure: two bar charts, both titled “Median Weekly Income (16-24)”, comparing men and women. The vertical axis of one chart runs from $300 to $390; the other runs from $0 to $400.]

**Percentage of:**

• 6% of 1200 = 6 / 100 * 1200 = 72

**Fraction >>> percentage:**

• 3/4 = 0.75 >>> 0.75 * 100% = 75%

**Percentage >>> decimal:**

• 27.3% = 27.3/100 = 0.273

**Decimal >>> percentage:**

• 0.852 >>> 0.852 * 100% = 85.2%
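The four conversions above can be checked with a short Python sketch. The function names are made up for this example; they are not part of any library.

```python
def percent_of(percent, total):
    """Return `percent` % of `total` (for example 6% of 1200)."""
    return percent * total / 100

def fraction_to_percent(numerator, denominator):
    """Convert a fraction such as 3/4 to a percentage."""
    return numerator / denominator * 100

def percent_to_decimal(percent):
    """Convert a percentage such as 27.3% to a decimal."""
    return percent / 100

def decimal_to_percent(decimal):
    """Convert a decimal such as 0.852 to a percentage."""
    return decimal * 100

print(percent_of(6, 1200))                   # 6% of 1200
print(fraction_to_percent(3, 4))             # 3/4 as a percentage
print(round(percent_to_decimal(27.3), 3))    # 27.3% as a decimal
print(round(decimal_to_percent(0.852), 1))   # 0.852 as a percentage
```

Note that "6% of 1200" is a count (72), not a percentage. The `round` calls only hide tiny floating-point rounding noise in the printed values.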

**Calculator: **

### Basic skills 2

### Design

• **Observational study:** observe and measure characteristics without trying to modify subjects. *Gallup poll.*

– Cross-sectional: *data observed, measured at one point in time.*

– Retrospective: *data are collected from the past (records).*

– Prospective: *data collected along the way from groups (smokers/nonsmokers).*

• **Experiment:** apply treatment and observe and measure effects. *Clinical trial for Lipitor.*

– Control: *blinding - placebo, double-blinding, blocks*

– Replication: *ability to repeat experiment*

– Randomization: *data needs to be collected in an appropriate (random) way, otherwise it is completely useless!*

– *Random sample:* members of the population are selected so that each individual member has the same chance of being selected.

– *Simple random sample of size n:* every possible sample of size *n* has the same chance of being chosen.
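A simple random sample can be drawn with Python's standard `random` module. This is a minimal sketch: the population of 20 numbered members and the sample size 5 are assumptions made for the example.

```python
import random

# Hypothetical population: members numbered 1 to 20.
population = list(range(1, 21))

rng = random.Random(42)  # fixed seed so the run is repeatable

# random.sample draws without replacement, so no member appears twice
# and every possible group of 5 members is equally likely to be chosen.
n = 5
simple_random_sample = rng.sample(population, n)

print(simple_random_sample)
```

Sampling in A's headquarters, as in the earlier example, fails this requirement because members outside the headquarters have no chance of being selected.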

### Design 2

**Sampling:**

• Systematic: *select a starting point, then choose every k-th member.*

• Convenience: *use easy-to-get data.*

• Stratified: *subdivide the population into at least 2 subgroups with a common characteristic and draw samples from each (e.g. gender or age).*

• Cluster: *divide the population into areas (clusters), then randomly select clusters and draw the sample from them.*
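The systematic, stratified, and cluster methods can be compared in a short Python sketch. Everything here is hypothetical: the population of 100 people, the `group` and `area` fields, and the function names are invented for illustration. The cluster version keeps every member of the chosen clusters, which is one common textbook variant and an assumption of this sketch.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 100 people with an id, an age group, and an area (0 to 9).
population = [
    {"id": i, "group": "under 30" if i % 2 == 0 else "30 and over", "area": i // 10}
    for i in range(100)
]

def systematic_sample(members, k, start):
    """Take the member at position `start`, then every k-th member after it."""
    return members[start::k]

def stratified_sample(members, key, n_per_stratum):
    """Split members into subgroups (strata) by `key`; draw randomly from each."""
    strata = {}
    for m in members:
        strata.setdefault(m[key], []).append(m)
    return [m for stratum in strata.values() for m in rng.sample(stratum, n_per_stratum)]

def cluster_sample(members, key, n_clusters):
    """Randomly pick whole clusters and keep every member of the chosen ones."""
    clusters = sorted({m[key] for m in members})
    chosen = set(rng.sample(clusters, n_clusters))
    return [m for m in members if m[key] in chosen]

systematic = systematic_sample(population, k=10, start=3)
stratified = stratified_sample(population, "group", 5)
clustered = cluster_sample(population, "area", 2)

print([m["id"] for m in systematic])
print([m["id"] for m in stratified])
print(sorted({m["area"] for m in clustered}))
```

Stratified sampling takes a few members from every subgroup, while cluster sampling takes many members from only a few areas.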

**Sampling error:** the difference between a sample result and the true population result; it results from chance sample fluctuations.

**Nonsampling error:** occurs when data are incorrectly collected, measured, recorded, or analyzed.
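Sampling error can be seen by drawing several random samples from the same population and comparing each sample result with the true value. The sketch below uses an invented population in which 40% answer "yes"; the numbers are assumptions for illustration only.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 people, 40% answer "yes" (1), 60% "no" (0).
population = [1] * 4000 + [0] * 6000
true_proportion = sum(population) / len(population)

# Draw 5 random samples of 200 people and record each sampling error.
errors = []
for _ in range(5):
    sample = rng.sample(population, 200)
    sample_proportion = sum(sample) / len(sample)
    errors.append(sample_proportion - true_proportion)

for e in errors:
    print(f"sampling error: {e:+.3f}")
```

Every sample here is collected and recorded correctly, so the differences come from chance alone. A nonsampling error, such as miscounting the answers, would not shrink with a better random draw.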