Lecture: Wang Zhanli (王占礼)

A TIME TO LEARN AND SHARE

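Testing the Measurement Invariance of the English Paper of the CIFA International Freight Forwarder Examination

Testing cross-gender and cross-region construct validity in the CIFA English test for certification of freight forwarders

Wang Zhanli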

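Measurement Invariance (MI)

Measurement invariance is a psychometric term first proposed by Drasgow, who borrowed a similar concept from item response theory (IRT). It means that, under different conditions of observing and studying phenomena, measurement operations yield measures of the same attribute. Depending on what is being tested, measurement invariance forms four levels, from lowest to highest: configural invariance, weak invariance, strong invariance, and strict invariance.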

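Levels of invariance

Configural invariance, also called structural invariance, means that the basic structural relationships between latent variables and observed variables are the same across groups. That is, each latent variable is measured by the same observed variables, but the corresponding parameters are not required to be equal.

Weak invariance, also called factor loading invariance, means that the factor loadings are equal across groups. This implies that each observed variable has the same unit in every group: a one-unit change in the latent variable produces the same amount of change in the observed variable in each group.

Strong invariance, also called intercept invariance, means that the intercepts of the observed variables, when predicted from the latent variables, are equal across groups. Strong invariance implies that the measurement has an equivalent reference point in every group, so that cross-group differences in the observed variables fully reflect cross-group differences in the latent variables being measured. In other words, comparing means across groups is meaningful.

Strict invariance, also called error invariance, means that the measurement error of each observed variable has the same variance across groups. At this level, cross-group tests of homogeneity of variance are meaningful.

Statistically, the four levels of invariance are hierarchically nested: testing a higher level is meaningful only after the level below it has been confirmed. Measurement invariance testing therefore proceeds in this order.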
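
The four nested levels can be written compactly in factor-model notation. This is only a sketch: the symbols below follow common multi-group factor analysis conventions and are assumptions, not notation taken from the lecture. For group g, x is the vector of observed variables, xi the latent variables, tau the intercepts, Lambda the loadings, delta the measurement errors, kappa the latent means, Phi the latent covariance matrix, and Theta the error covariance matrix.

```latex
% Multi-group factor model for groups g = 1, ..., G (assumed notation)
\begin{align*}
x^{(g)} &= \tau^{(g)} + \Lambda^{(g)} \xi^{(g)} + \delta^{(g)} \\
\mu^{(g)} &= \tau^{(g)} + \Lambda^{(g)} \kappa^{(g)}, \qquad
\Sigma^{(g)} = \Lambda^{(g)} \Phi^{(g)} \Lambda^{(g)\top} + \Theta^{(g)}
\end{align*}

% Nested equality constraints, from lowest to highest level
\begin{align*}
\text{configural:}\quad & \Lambda^{(g)} \text{ has the same pattern of fixed and free loadings in every group} \\
\text{weak:}\quad & \Lambda^{(1)} = \dots = \Lambda^{(G)} \\
\text{strong:}\quad & \text{weak, and } \tau^{(1)} = \dots = \tau^{(G)} \\
\text{strict:}\quad & \text{strong, and } \Theta^{(1)} = \dots = \Theta^{(G)}
\end{align*}

% Consequence of strong invariance for two groups:
% observed mean differences depend only on latent mean differences
\begin{equation*}
\mu^{(1)} - \mu^{(2)} = \Lambda \left( \kappa^{(1)} - \kappa^{(2)} \right)
\end{equation*}
```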

Construct

Literature review

We often have several groups in our analyses: different cultures, regions or countries.

In order to compare relationships between constructs or means across groups, we need a certain level of invariance of the constructs across those groups.

The meaning of invariance is “whether or not, under different conditions of observing and studying phenomena, measurement operations yield measures of the same attribute” (Horn and McArdle 1992, 117).

Techniques to test invariance

Various techniques have been developed to test measurement invariance (De Beuckelaer, 2005).

Multiple-group confirmatory factor analysis (MGCFA: Jöreskog 1971) is among the most powerful.

Configural Invariance (1)

The lowest level of invariance is ‘configural’ invariance.

Configural invariance requires that the items in the measuring instrument exhibit the same configuration of loadings in each of the different countries.

That is, the confirmatory factor analysis confirms that the same items measure each construct in all countries (or groups) in the cross-national study.

Configural Invariance (2)

Configural invariance is supported if (a) a single model specifying which items measure each construct fits the data well, (b) all item loadings are substantial and significant, (c) there are no large modification indices, and (d) the correlations between the factors are less than one. The latter requirement guarantees discriminant validity between the factors (Steenkamp and Baumgartner 1998).

Measurement invariance (1)

Configural invariance does not ensure that the people in different nations understand the items in the same way.

The factor loadings may still be different across countries.

The test of the next higher level of invariance, ‘measurement’ or ‘metric’ invariance, requires that the factor loadings between items and constructs are invariant across nations.



Measurement invariance (2)

It is tested by constraining the factor loading of each item on its corresponding construct to be the same across groups.

Measurement invariance is supported if the model cannot be significantly improved by releasing some of the constraints.

Partial measurement invariance (1)

However, for cross-cultural comparison to be allowed, it is not necessary that all factor loadings are equal.

Several scholars have suggested that it is enough to have two equal factor loadings per construct across countries to allow comparison of effects.

They termed it partial measurement (metric) invariance (Byrne, Shavelson, and Muthén 1989; Steenkamp and Baumgartner 1998).

Scalar invariance (1)

A third level of invariance is necessary to allow mean comparison of the underlying constructs across countries.

This is often a central goal of cross-national research.

Such comparisons are meaningful only if ‘scalar’ invariance of the items is ensured.

Scalar invariance guarantees that cross-country differences in the means of the observed items are a result of differences in the means of their corresponding constructs.

Scalar invariance (2)

To assess scalar invariance, one constrains the intercepts of the underlying items to be equal across countries.

It is supported if the model fit to the data is good and if it cannot be improved by releasing some of the equality constraints.
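
Why equal intercepts matter for mean comparison can be seen in a small numeric sketch. It assumes the usual linear measurement model, in which the expected mean of each item is its intercept plus its loading times the latent mean. All numbers and function names are hypothetical and chosen only for illustration; they are not CIFA results.

```python
def implied_item_means(intercepts, loadings, latent_mean):
    """Model-implied item means: intercept + loading * latent mean, per item."""
    return [tau + lam * latent_mean for tau, lam in zip(intercepts, loadings)]


def mean_differences(means_a, means_b):
    """Item-by-item difference in means, group B minus group A."""
    return [b - a for a, b in zip(means_a, means_b)]


# Hypothetical parameters for three items; loadings are equal in both groups
# (metric invariance is assumed to hold already).
loadings = [1.0, 0.8, 1.2]
intercepts_a = [2.0, 1.5, 3.0]

# Case 1: scalar invariance holds (same intercepts), latent means differ by 0.5.
means_a = implied_item_means(intercepts_a, loadings, latent_mean=0.0)
means_b = implied_item_means(intercepts_a, loadings, latent_mean=0.5)
diff_invariant = mean_differences(means_a, means_b)

# Case 2: latent means are equal, but item 2 has a higher intercept in group B.
intercepts_b_biased = [2.0, 2.0, 3.0]
means_b_biased = implied_item_means(intercepts_b_biased, loadings, latent_mean=0.0)
diff_biased = mean_differences(means_a, means_b_biased)

print("invariant intercepts:", diff_invariant)
print("biased intercept on item 2:", diff_biased)
```

In the first case every item difference equals its loading times the latent mean difference, so the observed gap reflects the construct. In the second case item 2 shows a gap even though the latent means are identical, which is the situation scalar invariance rules out.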

Invariance - summary

Meaningful comparison of construct means across countries requires three levels of invariance: configural, metric, and scalar.

Meaningful comparison of relationships between constructs requires two levels of invariance: configural and metric.

Only if all these types of invariance are supported can we confidently carry out comparisons.

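Introduction to the CIFA examination

The CIFA international freight forwarder examination is a professional certification examination commissioned by the former Ministry of Foreign Trade and Economic Cooperation (now the Ministry of Commerce) and organized and administered by the China International Freight Forwarders Association (CIFA). Since its launch in 2002, nearly 160,000 people have taken the examination and nearly 60,000 of them have obtained the certificate (China International Freight Forwarders Association, 2011). Test centers are spread across the provinces and cities of the country, and the examination is highly regarded and widely recognized in the industry. Participating institutions are often compared with one another, and examination results have a strong washback effect on English teaching at those institutions. Because the examination is authoritative, large in scale, and high-stakes, it must be scientific and rigorous. In particular, it must be fair and impartial to test takers in different groups (gender, region, and so on) and show good cross-group measurement invariance and cross-group validity. Only then are score interpretations and between-group comparisons meaningful.

AMOS

Structural equation modeling (SEM) covers a range of statistical techniques, such as path analysis, confirmatory factor analysis, causal models with latent variables, and even analysis of variance and multiple linear regression. AMOS is a software package for fitting structural equation models. Amos is short for Analysis of Moment Structures. It implements the general approach to data analysis known as structural equation modeling (SEM), also known as analysis of covariance structures, or causal modeling. This approach includes, as special cases, many well-known conventional techniques, including the general linear model and common factor analysis.

The value 0.49 is the correlation between Education and Income. The values 0.72 and 0.11 are standardized regression weights. The value 0.60 is the squared multiple correlation of SAT with Education and Income.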
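
The three kinds of numbers on this path diagram are linked. For a standardized regression with two correlated predictors, the squared multiple correlation equals the sum of the squared standardized weights plus twice their product times the predictor correlation. The sketch below plugs in the rounded values quoted on the slide; the function name is hypothetical, and the slide does not say which weight belongs to which predictor, which does not matter here because the formula is symmetric.

```python
def r_squared_two_predictors(beta1, beta2, r12):
    """Squared multiple correlation for two standardized predictors.

    beta1, beta2: standardized regression weights
    r12: correlation between the two predictors
    """
    return beta1 ** 2 + beta2 ** 2 + 2.0 * beta1 * beta2 * r12


# Rounded values as displayed on the slide.
r2 = r_squared_two_predictors(0.72, 0.11, 0.49)
print(round(r2, 3))  # prints 0.608
```

The result, about 0.61, agrees with the reported 0.60 to within the rounding of the displayed coefficients.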

Model comparison

Analysis steps

Collecting and treating data.

Setting up the theoretical model and testing how well it fits the various subpopulations of test takers.

Nested model testing.

Model assessment (model selection).
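
The nested model testing step can be illustrated with a chi-square difference test, one common way to compare a constrained model with a less constrained one. The lecture does not state which statistic was used, so this is only a sketch under that assumption. The fit values are hypothetical, not results from the CIFA data, and `chi2_sf` is a small standard-library helper written for this illustration; a statistics library would normally supply it.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df > 0."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def chi_square_difference(chisq_constrained, df_constrained, chisq_free, df_free):
    """Chi-square difference test for two nested models.

    Returns (delta chi-square, delta df, p-value). A small p-value means the
    added equality constraints significantly worsen the fit.
    """
    delta_chisq = chisq_constrained - chisq_free
    delta_df = df_constrained - df_free
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    return delta_chisq, delta_df, chi2_sf(delta_chisq, delta_df)


# Hypothetical (chi-square, df) pairs for two nested multi-group models.
configural = (120.4, 48)  # loadings free in each group
metric = (131.9, 54)      # loadings constrained equal across groups

delta_chisq, delta_df, p_value = chi_square_difference(*metric, *configural)
print(f"delta chi-square = {delta_chisq:.1f}, delta df = {delta_df}, p = {p_value:.3f}")
```

With these made-up numbers the p-value is above 0.05, so the equal-loading constraints would not be rejected and weak (metric) invariance would be retained. The same comparison is then repeated for each higher level against the level below it.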

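Implications

Cross-gender invariance is ideal: strong invariance holds.

Cross-region invariance is fairly good: weak invariance holds.

The causes of the DIF (differential item functioning) items remain to be investigated.

Caution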

Recent studies suggest that when full or partial measurement invariance is not guaranteed, it may still be the case that constructs are equivalent. Saris and Gallhofer (2007, chapter 16) indicate that the test of measurement invariance is too strict and may fail although cognitive equivalence still holds.

Thank you very much for your attention!

I would appreciate your comments and advice.