Research Methods Class #3


Research Methods Class #3

Carolyn R. Fallahi, Ph. D.


Validity in Experimental Design

Types of validity:

 Internal Validity

 Construct Validity

 External Validity


Internal Validity

 Validity = the correctness or truth of an inference as it relates to the independent variable (IV) and dependent variable (DV).

 Validity asks the question: Are these inferences accurate and correct?


Internal Validity

  Confounding = the experiment contains a variable that systematically varies with the IV. This is an important point because extraneous variables may or may not introduce a confound within an experiment.


Threats to Internal Validity

 Subject characteristics threat (p. 179 of your book): things like age, strength, maturity, gender, ethnicity, coordination, speed, intelligence, vocabulary, attitude, reading ability, fluency, manual dexterity, SES, religious beliefs, and political beliefs.


Control

 Shadish et al. (2002) have identified a number of extraneous variables that can affect a study.


Shadish et al.

 History: refers to any event that occurs between the beginning of experimental treatment and the measurement of the DV that could produce the observed outcome.

 For example, Shadish and Reish (1984).


Shadish study

 A history threat can also occur in a study that is designed to have both pre- and post-measurement of the DV.

 For example, Schoenthaler (1983) investigated the impact of dietary change on violent and aggressive behaviors of institutionalized juveniles.


Maturation

  Maturation: refers to changes in the internal conditions of the individual that occur as a function of the passage of time.

Also called (in book) maturation threat.


Instrumentation

  Instrumentation: refers to changes that occur over time in the measurement of the DV.

Any changes that create problems due to the instrument itself are also called instrument decay.


Instrumentation

 The measurement situation that is most subject to the instrumentation source of error is one that requires the use of human observers.



Testing

 Testing: refers to changes in the score a participant makes on the second administration of a test as a result of previously having taken the test.

 Also called (in book) testing threat.

Regression Artifact

 Many psychological experiments (such as attitude change experiments) require pre- and posttesting on the same DV measure, or some equivalent form, for the purpose of measuring change. In addition, these studies sometimes select only two groups of research participants having extreme scores, such as high and low attitude scores.

The two extreme scoring groups are then given an experimental treatment condition, and a posttest score is obtained.


Regression Artifact

 A variable that could cause the pre- and posttest scores of the extreme groups to change is the regression artifact.

 Also called (in book) regression threat.

 Regression artifact refers to the fact that extreme scores in a particular distribution will tend to move, or regress, toward the mean of the distribution as a function of repeated testing.
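
A small simulation can make the regression artifact concrete. This is a minimal sketch, assuming a simple hypothetical model in which each observed score is a stable true score plus independent random error; all numbers are invented for illustration, not taken from any of the studies cited here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: observed score = stable true score + independent random error.
true_scores = rng.normal(100, 10, size=1000)
pretest = true_scores + rng.normal(0, 10, size=1000)
posttest = true_scores + rng.normal(0, 10, size=1000)   # note: no treatment is applied

# Select only the extreme scorers on the pretest (top 10%).
extreme = pretest >= np.percentile(pretest, 90)

print("Extreme-group pretest mean: ", round(pretest[extreme].mean(), 1))
print("Extreme-group posttest mean:", round(posttest[extreme].mean(), 1))
# The posttest mean falls back toward 100 even though nothing was done;
# that apparent "change" is regression toward the mean.
```

Because the group was selected for having extreme pretest scores, its members tend to have unusually large errors at pretest, and those errors do not repeat at posttest, so the group mean moves toward the overall mean on retesting.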


Attrition

 Attrition (or mortality threat as stated in your book): refers to the fact that some individuals do not complete the experiment for a variety of reasons such as failure to show up at the scheduled time and place or not participating in all phases of the study.


Selection

 Selection: exists when a differential selection procedure is used for placing research participants in the various comparison groups.

Additive and Interactive Effects

 Refers to the fact that the threats to internal validity can combine to produce complex biases. Validity threats do not necessarily operate in isolation; they can operate simultaneously.

 For example, selection can combine with a maturation, history, or instrumentation effect.


Additive and Interactive Effects

 To illustrate a selection-maturation effect, suppose you want to teach the concepts of good and bad to 5-year-old children with and without hearing difficulties.

 Kusche and Greenberg (1983)

Construct Validity

 Construct validity: concerned with the extent to which the experimental operations represent, and therefore can be used to infer, the higher-order constructs they describe.

 For example, is a person who has had an income below the poverty level for 6 months a good representative of the construct of a disadvantaged person?


Threats to the Experimental Situation

 1. Reactivity to the Experimental Situation: refers to the fact that the motives and perceptions that research participants bring with them to the experiment can influence their perception of the experiment and the responses they make to the DV.


Threats to the Experimental Situation

 2. Participant Effect: In an experiment, the researcher would like to have ideal participants – participants who bring no preconceived notions to the lab, who accept instructions, and who are motivated to respond in as truthful a manner as possible.

 Demand characteristics (Orne, 1962)

 Example: Christensen (1977) / Bradley (1978)

Threats to the Experimental Situation

 3. Conditions producing a positive self-presentation motive. Example: Tedeschi, Schlenker, & Bonoma (1971) described conditions under which a subject may use the self-presentation bias.

 Behavior is indicative of the subject’s true intentions

 Beliefs

 Feelings

Threats to the Experimental Situation

 Experimenter Effects: Remember, subjects (Ss) used in psychological research are usually not apathetic or willing to passively accept and follow the experimenter’s (E’s) instructions.


Threats to the Experimental Situation

 Lyons (1964) states that the E wants research Ss to be perfect servants – intelligent individuals who will cooperate and maintain their position without becoming hostile or negative. It is easy to see why such a desire exists.


Threats to the Experimental Situation

 The E has expectations regarding the outcome of the experiment.

 Example: the Clever Hans story.

Threats to the Experimental Situation

 The ways that an E can potentially bias the results of an experiment can be divided into 2 types: bias arising from the attributes of the E and bias resulting from the expectancy of the E.

 Experimenter Attributes: the physical and psychological characteristics of an E that may interact with the IV to cause differential performance in research Ss.

Threats to the Experimental Situation

 Rosenthal (1966) has proposed three categories of attributes.

 Biosocial attributes – the E’s sex, age, race, religion.

 Psychosocial attributes – the E’s anxiety level, need for social approval, hostility, authoritarianism, intelligence, social behavior, warmth, etc.

 Situational factors – whether or not the S and E have had prior contact, whether the E is a naïve or an experienced researcher, friendly or hostile, etc.

Issues with high-stakes testing

 What is high-stakes testing?

 Relying on a single test score to make important decisions about students.

 If a student scores high on one test = honors program.

 If a student scores low on one test = rejection from college, programs, etc.

 Policy makers are using more and more high-stakes testing for making decisions.


Reliability

 Consistency and repeatability of results.

 IQ


Errors of measurement

 Reliability estimates provide researchers with an idea of how much variation to expect.

 Such estimates are expressed as a correlation coefficient called the reliability coefficient.


Test-retest reliability

 When a measurement yields similar results with repeated testing, we say it shows test-retest reliability.

 Compare: split-half reliability (an internal-consistency approach discussed later).
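
As a rough sketch of how a test-retest reliability coefficient can be computed: the scores from the two administrations are simply correlated (Pearson r). The scores below are invented purely for illustration; in practice you would use the actual time-1 and time-2 scores.

```python
import numpy as np

# Hypothetical scores for the same 8 people tested on two occasions (invented numbers).
time1 = np.array([12, 15, 9, 20, 18, 11, 14, 17])
time2 = np.array([13, 14, 10, 19, 17, 12, 15, 16])

# The test-retest reliability coefficient is the Pearson correlation
# between the two administrations.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability: r = {r:.2f}")
```

The same calculation, applied to scores from two parallel versions of a test taken on one occasion, gives the equivalent-forms estimate discussed later.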

Interrater or interobserver reliability

 If data from two observers tend to agree, their measurements show interrater or interobserver reliability.
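
One simple way to quantify interobserver agreement is the proportion of observations on which two observers give the same rating; Cohen’s kappa additionally corrects for agreement expected by chance. The ratings below are hypothetical and used only as a minimal sketch.

```python
import numpy as np

# Hypothetical categorical ratings from two observers (e.g., 1 = behavior present, 0 = absent).
obs_a = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
obs_b = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 1])

# Simple percent agreement.
p_observed = np.mean(obs_a == obs_b)

# Chance agreement: probability both say 1 plus probability both say 0.
p1a, p1b = obs_a.mean(), obs_b.mean()
p_chance = p1a * p1b + (1 - p1a) * (1 - p1b)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"Percent agreement: {p_observed:.2f}, Cohen's kappa: {kappa:.2f}")
```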


Interrater or interobserver reliability

 SATs: Questions of reliability and validity.


What about the SATs?

 In general, SAT scores do predict how well students perform in their first year of college, but the SAT by itself does not provide enough information to make good predictions about college performance.

Equivalent forms

 Equivalent forms assesses reliability by comparing the consistency of the scores obtained from people who have been measured on two equivalent forms of the test.

 They take both forms of the test on one occasion and then the scores from both forms are compared.


Equivalent forms

 If both forms are constructed to be equivalent, they should yield similar results.

 If they do, then evidence exists that the test is reliable. The primary difficulty with this method is developing two forms of the test that are truly equivalent.


Internal consistency methods

Instead of requiring two administrations or testing sessions, there are several internal consistency methods of estimating reliability that require only a single administration of an instrument.


Split half reliability

 In addition to knowing how well each item correlates with the rest of the items, researchers also need to know how reliable the measure is as a whole.

 Historically, researchers have used split-half reliability as an index of interitem reliability.

 Here the items on a test are divided into two halves (for example, odd-numbered versus even-numbered items), and the scores on the two halves are correlated. The estimate depends on exactly how the test is split, which makes it somewhat ambiguous. A sketch of the procedure follows.
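
This is a minimal sketch of the split-half procedure, assuming item-level scores are available: split the items into two halves (here, odd versus even items), correlate the half scores, and then step the correlation up with the Spearman-Brown formula, since a correlation between half-length tests understates the reliability of the full-length test. The data are invented for illustration.

```python
import numpy as np

# Hypothetical item scores: 6 people (rows) x 8 items (columns), invented for illustration.
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
])

# Split into halves (odd- vs. even-numbered items) and sum each half per person.
half1 = items[:, ::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)

# Correlation between the two half-test scores.
r_half = np.corrcoef(half1, half2)[0, 1]

# Spearman-Brown correction estimates the reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(f"Half-test r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")
```

A different split (say, first half versus second half) would generally give a somewhat different estimate; that is the ambiguity the next slide refers to.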


Alpha Coefficient

 To get around this ambiguity, researchers now use Cronbach’s alpha coefficient.

 Cronbach’s alpha coefficient is equivalent to the average of all possible split-half reliabilities. As a rule of thumb, researchers consider a measure to have adequate interitem reliability if Cronbach’s alpha coefficient exceeds .70. This is because .70 means that 70% of the total variance in participants’ scores on the measure is systematic, true-score variance.
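
As a minimal sketch, alpha can be computed from an n-people-by-k-items matrix of scores using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The data below are invented, and the use of sample (n−1) variances is one common convention.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = people, columns = items."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 5-point ratings: 5 people x 4 items (invented numbers).
scores = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```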


Kuder-Richardson

 The most common method for determining internal consistency or reliability of a test is the Kuder-Richardson approach, especially formulas KR20 and KR21.

 Kuder-Richardson measures inter-item consistency. It is like computing a split-half reliability across all of the different ways of splitting the test. A sketch of the KR-20 calculation follows.
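
As a minimal sketch, KR-20 applies the same logic as alpha to dichotomously scored (right/wrong) items: KR20 = k/(k−1) · (1 − Σ pᵢqᵢ / σ²ₓ), where pᵢ is the proportion answering item i correctly, qᵢ = 1 − pᵢ, and σ²ₓ is the variance of total scores. The answer data below are invented for illustration.

```python
import numpy as np

def kr20(items: np.ndarray) -> float:
    """items: rows = people, columns = items scored 1 (correct) or 0 (incorrect)."""
    k = items.shape[1]
    p = items.mean(axis=0)   # proportion correct per item
    q = 1 - p                # p * q is each item's variance computed over the sample
    total_variance = items.sum(axis=1).var(ddof=0)   # same (population) convention as p * q
    return (k / (k - 1)) * (1 - (p * q).sum() / total_variance)

# Hypothetical right/wrong scores: 6 people x 5 items (invented numbers).
answers = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
])
print(f"KR-20 = {kr20(answers):.2f}")
```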