BASICS OF
CONTENT
ANALYSIS
Presented by Natalia Tomlin
Assistant Professor and Technical Services Librarian
B. Davis Schwartz Memorial Library, LIU Post
DEFINING CONTENT
ANALYSIS
• “Summarizing, quantitative analysis of messages that
relies on the scientific method” (Neuendorf, 2002)
• “Technique for the objective, systematic, and quantitative
description of manifest content of communication”
(Berelson, 1952)
• “Research technique for making replicable and valid
inferences from texts (or other meaningful matter) in the
context of their use” (Krippendorff, 2004)
• “Procedures for defining, measuring, and analyzing both
the substance and meaning of texts or messages or
documents”
(Beck and Manuel, 2008)
Kimberly Neuendorf
Klaus Krippendorff
Stone, Dunphy, Smith, and
Ogilvie, 1966
CONTENT ANALYSIS:
QUANTITATIVE OR
QUALITATIVE?
• Quantitative – focus on numerically measurable
objectives
Research questions are stated as hypotheses
Use of inferential statistics
• Qualitative – focus on how things occur and how
people think about processes; exploratory research;
a more holistic, natural approach; language used as
primary data; the researcher is a part of the project.
Use of verbal categories and descriptive statistics
Content analysis may be quantitative or qualitative
BRIEF HISTORY OF
CONTENT ANALYSIS
• 17th century – analysis of texts by the Church
• Speed (1893) “Do newspapers now give the news?”
– content analysis of New York newspapers
• 1930s-1940s – early content analysis studies by
sociologists
• World War II – propaganda analysis
• 1950s – use of content analysis by psychologists,
anthropologists, historians, linguists, educators,
psychiatrists, literary critics, library science
• 1958 – first computer-aided content analysis
• Evolution from word count to discovering concepts
CONTENT ANALYSIS:
AREAS OF
IMPLEMENTATION
• Written materials – books, journals, official
documents, advertisements, speeches, conversations
• Visual items – films, clothing, works of art
• Sound texts, operas, musicals, lyrics
• Combinations of communication content: blogs,
webpages, performance art, computer programs
• Fields: marketing, literature, gender studies,
political science, psychology etc.
MANY PURPOSES OF
CONTENT ANALYSIS
• Disclose international differences in communication
content
• Audit communication against objectives
• Code open-ended questions in surveys
• Determine psychological state of a person or group
• Determine existence of propaganda
• Reveal focus of individual groups
• Reflect cultural patterns of groups
• Describe trends in communication content
(Berelson, 1952)
[Diagram: Content Analysis – Data Collection Technique – Research Methodology]
EXAMPLES
OF CONTENT ANALYSIS
STUDIES
• Walker (1975) – differences and similarities in
American black and white popular song lyrics,
1962-1973.
• Aries (1973) – socialization differences in male,
female, and mixed-sex small groups
• Adams and Schreibman (1978) – content analysis of
news media
• Graham, Kamins, Oetomo (1993) – analysis of
advertisements in Japan and Germany
• Horton (1986) – analysis of young adult books
• Kaur-Kasior (1987) – treatment of culture in
greeting cards
CONTENT ANALYSIS IN
LIS
• Turner and Beck (2002) - repair strategies of remote
users searching the online catalog
• Sproles and Ratledge (2004) - librarian job ads
• Koufogiannakis, Slater, Crumley (2004) - content
analysis of librarianship research
• Kuchi (2006) - academic library websites
• Tancheva (2003) - analysis of online tutorials
• Aharony (2009) - blogs of librarians
• LIS thesis and dissertation research (1946-1963):
62% of dissertations used content analysis
• Koufogiannakis and Slater (2004) – content analysis
is one of the top 5 preferred research methods in
LIS
Content Analysis in LIS
• Delivery of library services
• Resource-specific studies
• Studies of the profession itself
Content Analysis
• Conceptual: existence and frequency of concepts
• Relational: relationship among concepts
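The difference between conceptual and relational analysis can be sketched in a few lines of Python. This is only an illustration: the concept list, the three sentences used as recording units, and the function names are all invented here, and simple substring matching stands in for real coding rules.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concepts and recording units, invented for illustration only.
CONCEPTS = ["budget", "digitization", "staff"]
UNITS = [
    "Budget cuts delayed the digitization project.",
    "New staff were hired despite the budget.",
    "The digitization of rare maps continues.",
]

def concepts_in(unit):
    """Return the set of concepts that appear in one recording unit."""
    text = unit.lower()
    return {concept for concept in CONCEPTS if concept in text}

def conceptual_analysis(units):
    """Conceptual analysis: in how many units does each concept occur?"""
    counts = Counter()
    for unit in units:
        counts.update(concepts_in(unit))
    return counts

def relational_analysis(units):
    """Relational analysis: how often do two concepts share a unit?"""
    pairs = Counter()
    for unit in units:
        pairs.update(combinations(sorted(concepts_in(unit)), 2))
    return pairs

frequencies = conceptual_analysis(UNITS)
co_occurrences = relational_analysis(UNITS)
print(frequencies)
print(co_occurrences)
```

Co-occurrence within a unit is only one possible way to operationalize a "relationship among concepts"; a real study would define the relation in its codebook.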
RESEARCH QUESTIONS
• Research question:
Do technical services jobs require more advanced technology
skills than reference services jobs? (observed reality)
• Hypothesis: Technical services jobs require more advanced
information technology skills (prediction of
relationship between two variables)
• Importance of conceptual definitions of variables (exhaustive
and mutually exclusive; previously developed or new)
Coding is based on definitions
*** Some content analysis studies may state hypotheses but do
not employ tests for statistical significance
CONTENT ANALYSIS
DESIGN
• Unitizing
• Sampling
• Coding
• Reducing
• Inferring
• Narrating
Data making
UNITS (WHAT IS TO BE
OBSERVED)
• Sampling units
(issues of newspapers, blogs, individual speeches –
what to include or exclude in the analysis)
• Recording units
(blog posts, specific newspaper column)
• Context units – what can be communicated within the
text
(words, phrases, pictures, ideas)
SAMPLING
Sampling – ability to generalize the properties found in a
sample to the population from which the sample is drawn
• Random
- Simple random (random number generator)
- Systematic (every n-th element is chosen)
- Stratified (division of the population into different
subgroups, then random selection of the final
subjects proportionally from the different strata)
• Non-Random
- Purposive (selected based on knowledge of the
population and the purpose of the study)
- Convenience
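The three random sampling strategies can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the population of 100 job ads, the `library` field, the 10% sampling fraction, and the fixed seeds are all assumptions made for the example.

```python
import random

def simple_random_sample(population, k, seed=0):
    """Simple random: draw k items without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, start=0):
    """Systematic: take every step-th item, beginning at index start."""
    return population[start::step]

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Stratified: sample the same fraction at random from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 100 job ads, 60 academic and 40 public.
ads = [{"id": i, "library": "academic" if i < 60 else "public"}
       for i in range(100)]

simple = simple_random_sample(ads, 10)
systematic = systematic_sample(ads, step=10)
stratified = stratified_sample(ads, lambda ad: ad["library"], fraction=0.1)
```

With proportional allocation the stratified sample keeps the 60/40 split of the population, which a simple random sample of the same size does not guarantee.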
CODING
• Define the recording units (= units of analysis): word,
sentence, theme, paragraph, whole text (text must
be short)
• Define categories (variables) (mutually exclusive;
decide how broad/narrow categories will be)
• Provide conceptual definitions for variables
• Test the scheme on a sample of the text
• Assess accuracy and reliability
• Revise coding rules if needed
• Test again
• Code all text
• Assess reliability and accuracy (2nd time)
EXAMPLE OF
CODEBOOK
Unit of analysis: individual job ad posting
Conceptual definition: each academic library job ad
posted between 2012 and 2013 on the CHE website
• Job number
• Job posting date
• Job category
• Type of library
• Degree requirement
• Professional experience
• Preferred degrees
• Faculty status
CODEBOOK EXAMPLE
Job number
001
002
Job posting date
1. 01.01.2013-01.31.2013
2. 02.01.2013-02.28.2013
Job category
1. Administrative
2. Instructional
3. Technical Services
Type of library
1. Research library
2. Community college
3. 4-year college
Degree requirement
1. MLS only
2. MLS and one more master's degree
3. Other
Professional experience
1. None
2. 1+ years
3. 3+ years
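A codebook like this one can be kept in machine-readable form so that coded records are checked automatically. The sketch below is one possible representation, not part of the original study: the snake_case variable names, the `validate` and `decode` helpers, and the two coded ads are hypothetical.

```python
# Numeric codes and labels taken from the codebook example above;
# the variable names are assumptions made for this sketch.
CODEBOOK = {
    "job_category": {1: "Administrative", 2: "Instructional",
                     3: "Technical Services"},
    "type_of_library": {1: "Research library", 2: "Community college",
                        3: "4-year college"},
    "degree_requirement": {1: "MLS only",
                           2: "MLS and one more master's degree",
                           3: "Other"},
    "professional_experience": {1: "None", 2: "1+ years", 3: "3+ years"},
}

def validate(record):
    """Return a list of problems found in one coded job ad."""
    problems = []
    for variable, codes in CODEBOOK.items():
        if variable not in record:
            problems.append(f"{variable}: missing")
        elif record[variable] not in codes:
            problems.append(f"{variable}: unknown code {record[variable]!r}")
    return problems

def decode(record):
    """Translate the numeric codes of a valid record into labels."""
    return {variable: CODEBOOK[variable][record[variable]]
            for variable in CODEBOOK}

# Hypothetical coded ads; ad_002 contains a code the codebook lacks.
ad_001 = {"job_number": "001", "job_category": 3, "type_of_library": 1,
          "degree_requirement": 1, "professional_experience": 2}
ad_002 = {"job_number": "002", "job_category": 4, "type_of_library": 2,
          "degree_requirement": 3, "professional_experience": 1}

print(validate(ad_001), decode(ad_001))
print(validate(ad_002))
```

Rejecting unknown codes at entry time enforces the requirement that categories be exhaustive and mutually exclusive: every ad gets exactly one defined code per variable.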
CODEBOOK CREATION
• Use of established conceptual definitions adds
validity to the study (previous studies; established
sources such as ODLIS)
• Exploratory studies are more likely to create their
own conceptual definitions
• Codebook serves as a guide for coders and a record
of the project
• Codebook needs to be refined during pretesting
• Better to have too many categories than too few
ASSIGNING
ENUMERATIONS TO
VARIABLES
• Nominal – numbers are used only for labeling purposes;
they have no true value. Example: type of library
• Ordinal – rank ordered
• Interval – numbers represent distance between
categories within ranking. Example: years of
experience
• Ratio – always has a true “0” value. Example: age
QUALITY CONTROL
• Validation of coding schema through inter-coder
reliability test
• Acceptable inter-coder reliability levels vary
• Reliability test is done at the pilot stage and at the end of the
study; the results of the latter are reported in the
study.
• Reliability problems can be addressed by additional
training for coders, revising coding instructions,
combining and separating categories.
Calculating the agreement: nominal scale –
percentage agreement, Cohen’s kappa, and Scott’s pi; Pearson’s
correlation is used for scales beyond nominal
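For two coders assigning nominal categories, percentage agreement and the two chance-corrected coefficients can be computed as in the sketch below. The formulas are the standard ones (kappa takes expected agreement from each coder's own category proportions, pi from the pooled proportions); the two lists of codes are invented for illustration.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of units on which both coders chose the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: chance agreement from each coder's own marginals."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum((count_a[c] / n) * (count_b[c] / n)
                   for c in set(a) | set(b))
    return (observed - expected) / (1 - expected)

def scotts_pi(a, b):
    """Scott's pi: chance agreement from the pooled marginals."""
    n = len(a)
    observed = percent_agreement(a, b)
    pooled = Counter(a) + Counter(b)
    expected = sum((pooled[c] / (2 * n)) ** 2 for c in pooled)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same ten units.
coder_a = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
coder_b = [1, 1, 2, 3, 3, 3, 1, 1, 3, 1]

print(percent_agreement(coder_a, coder_b))
print(cohens_kappa(coder_a, coder_b))
print(scotts_pi(coder_a, coder_b))
```

Both coefficients come out lower than the raw percentage because part of the observed agreement would be expected by chance alone.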
REPORTING
FINDINGS
• Reporting in raw numbers, percentages, or
frequencies
• Must directly address research questions
• Format: bars, charts, tables
• Test of statistical significance (Chi-square) =
associations between nominal variables
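A chi-square test of association between two nominal variables can be computed by hand, as in this sketch. The 2x2 table of counts is invented (it loosely mirrors the earlier question about technical services versus reference jobs), and the function name is an assumption; 3.841 is the standard critical value for 1 degree of freedom at the 0.05 level.

```python
def chi_square(table):
    """Return the chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts of job ads.
# Rows: technical services, reference.
# Columns: advanced technology skills required, not required.
observed = [[30, 20],
            [15, 35]]

statistic, dof = chi_square(observed)
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
significant = dof == 1 and statistic > CRITICAL_VALUE_DF1_ALPHA_05
print(statistic, dof, significant)
```

A statistic above the critical value suggests the two variables are associated; it says nothing about the strength or direction of the association.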
ANALYSIS OF THE
STUDY (1)
“Libraries and public perceptions: A
comparative analysis of the European press:
Methodological insights” by Anna Galluzzi
(2014)
“The analysis of newspapers has been
figured out as an alternative method to measure
the relevance and the public perception of
libraries”
“The research aims at quantifying and
qualifying the presence of
issues concerning libraries in the
European press over the last years
…in order to answer the following
research questions:
• which are the most discussed topics
concerning libraries and
have they changed over the last years?
• are there any significant differences
between the European countries in the
debate about libraries?
• are there any significant differences
between the European newspapers in
the debate about libraries”
“chronological span covered by the research
is five years, from 2008 to 2012. This choice
was made because 2008 is generally
considered the starting point of the
economic crisis which is still deeply
affecting the Western economies and
political scenarios”
“Countries taken into account are the
United Kingdom, France, Spain and Italy,
since they are considered representative of
different areas and cultural traditions in
Europe”
“second selection was made among the
numerous print newspapers published, with
the objective of choosing two titles for each
country according to the following basic
criteria. The two newspapers were picked
among those of national relevance, the
most widespread and the oldest in each
country, avoiding - if possible - those officially
representing political parties and the radical ones.
The selected newspapers are the following:
1.The United Kingdom: The Times and The
Guardian
2.France: Le Figaro and Le Monde
3.Spain: El Mundo and El País
4.Italy: Corriere della Sera and La
Repubblica
The keywords used as query parameters in the full text
search were “librar*” and “bibliot*”.
The articles retrieved using the above-mentioned parameters are
41,611.
After the retrieval of the articles responding to the query
parameters, the second step was to select the
pertinent ones, i.e. those articles which
concern libraries in a proper sense.
The pertinent articles are 3,659.
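The truncated keywords “librar*” and “bibliot*” can be imitated with regular expressions. This sketch is an assumption about how such a query behaves, not a description of the newspaper databases used in the study; the four headlines are invented. The last lines work out the share of retrieved articles judged pertinent from the two totals reported above.

```python
import re

QUERY_TERMS = ["librar*", "bibliot*"]

def truncation_to_regex(term):
    """Turn a right-truncated term such as 'librar*' into a regex that
    matches any word beginning with the stem."""
    stem = re.escape(term.rstrip("*"))
    return rf"\b{stem}\w*"

PATTERN = re.compile(
    "|".join(truncation_to_regex(term) for term in QUERY_TERMS),
    re.IGNORECASE,
)

def matches_query(text):
    """True if the text contains a word matching any query term."""
    return PATTERN.search(text) is not None

# Hypothetical headlines, invented for illustration.
headlines = [
    "Public libraries face new budget cuts",
    "La biblioteca nazionale riapre",
    "Une nouvelle bibliothèque à Paris",
    "Football results from the weekend",
]
retrieved = [h for h in headlines if matches_query(h)]

# Totals reported in the study.
RETRIEVED_TOTAL = 41_611
PERTINENT_TOTAL = 3_659
pertinent_share = PERTINENT_TOTAL / RETRIEVED_TOTAL
print(retrieved)
print(f"{pertinent_share:.1%}")
```

A keyword match only establishes that a word occurs; deciding whether an article concerns libraries "in a proper sense" still requires the manual selection step the author describes.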
“After the selection, a text and content
analysis of the articles was carried
out.
Though aware of the many advantages
(speed, completeness, objectivity and
precision) of an automatic processing, the
risk to think that the whole analysis could be
delegated to computer software, instead
of using them to speed up and enhance it,
was given a special credit.
the analysis was carried out manually and
no text analysis software was used, starting
from the firm belief that no software
can replace human reasoning. A certain
degree of subjectivity was
considered somewhat inevitable and
acceptable”
“First of all, each article was identified
with a univocal name and an Excel™
worksheet was prepared to host the results
of the coding.
Then, the articles were analyzed and
coded.
At the beginning, the texts were carefully
reviewed and all concepts and ideas were
annotated as they appeared and then
grouped.”
Variables/Coding categories
1. country
2. newspaper title
3. year of publication
4. prevalence or not of libraries as subject
of the article
5. type of library considered: Public,
National, Academic, School,
Special/Specialized, No specification or
more than one type
6. main topic of the article: Mission/Roles,
Conservation/Holdings/Catalogue,
Digitization/Digital libraries, History,
Reading/Marketing,
Politics/Strategy/Management, Library
closures/Budget cuts, Internet/E-book/Technology, Services/Users,
Staff/Recruitment, New libraries/New
buildings, Acquisitions/Open access,
Buildings/Architecture.
7. the newspaper section where the article is
published: Opinions/Letters/Debates,
Culture/Education, In brief, Cities/
/National news, World/International news,
Market/Economy/Business, Society, Science,
Other
ANALYSIS OF THE
STUDY (2)
“The Role of Online Videos in Research
Communication: A Content Analysis of
YouTube Videos Cited in Academic
Publications” by Kousha, Thelwall, and
Abdoli (2012)
“This article explores the extent to which
YouTube videos are cited in academic
publications and whether there are
significant broad disciplinary differences in
this practice”
Research questions:
“How frequently are YouTube videos cited in
academic publications and has frequency of
use declined at any stage since the birth of
YouTube (2005–2011)?
What types of YouTube videos are
commonly cited in research articles?
Are there significant broad disciplinary
differences in citing online videos?”
Data collection
Researchers “extracted URL citations to
YouTube videos from academic
publications indexed by Scopus from
2005 to 2011 across four broad
disciplines: the sciences, medicine and
health sciences, social sciences, and arts
and humanities. We then viewed a
sample of the cited videos and classified
their contents using a specially designed
classification scheme”
“viewed 551 randomly sampled cited videos from
research articles (omitting reviews, conference papers, editorials,
letters, and notes) from the Scopus searches. In many cases, we also read
the descriptions of, and some comments on, the YouTube videos (if
available) and searched for a lecturer or speaker biography to better
understand video contexts.
The first and third authors separately conducted an initial
content analysis of the videos based on a primary classification
scheme derived from a previous classification of YouTube videos tweeted by
academics (Thelwall et al., in press). To reach a reasonable degree of
agreement on the classification procedure, the two coders first
cross-checked the categorization process for a sample of 80 videos from different
subject areas, discussing the coding of different types of videos.”
Examples of the categories they used:
“Demonstration of a natural or formal
science phenomenon: This subclass
includes videos with an apparently
scientific theme such as a real-time lab
experiment in robotics
Natural or formal science documentary:
This subclass includes documentaries
(usually with narration and edited
with different types of shots) about
natural or formal science
Natural or formal science academic
lectures: This group includes natural or
formal science lectures, speeches, and
talks by academics in conferences”
Limitations: “Another practical limitation was the complex and subjective
issue of coding video contents. We discussed the coding
system after the initial classification process and modified it
several times to get general agreement. For instance, we first
merged television shows and news-related videos into one
class, but subsequently split them into two subclasses
because shows are more related to arts and humanities
whereas the news is more associated with the social sciences
(e.g., political science and journalism). Furthermore, some
scientific demonstrations also can be used for academic education,
and, in rare cases, it was difficult to recognize whether
they were created for scientific demonstrations, entertainment,
or teaching.”
ANALYSIS OF THE
STUDY (3)
“An analysis of American academic libraries'
websites: 2000-2010” by Noa Aharony (2012)
“It is …interesting to trace the changes and
developments that academic library websites
have undergone over the last ten years, as
expressed through the library websites
themselves”
“research questions are:
Is there a difference between the content
of academic library websites in the year
2000 and in the year 2010?
What are the LIS current trends and
tendencies being expressed through
those academic library websites?”
Conceptual definition:
“According to [23] McGillis and Toms
(2001), a library website reflects its
virtual public face, acting as a front
door to the collections, services, and, to
an extent, its staff”
“The first phase of the investigation
involved choosing academic library
homepages, which appear both on a
current webpage and in the Internet
Archive, to be included in the sample.
These were located by examining the
Association of College and Research
Libraries (ACRL) accredited LIS
schools, numbering 57. A total of 31
academic libraries were selected from
this list based on the following criteria:
The library has a current homepage.
The library homepage appears in the
Internet Archive in the year 2000.
Four out of the 31 libraries were not
found in the Internet Archive in 2000, so
data were collected from the first year
that they appear in the Internet
Archive”
Time frame: “The year 2000 was chosen
because: firstly, while the Internet Archive
began archiving its documents in 1996, most
of the academic library content is found
from the year 2000 onwards; and secondly, a
ten year period was deemed suitable for
tracing the changes, developments, and
trends of the last decade, which contained
many
technological innovations and conceptual
changes in the field of library and
information science”.
She conducted “content analysis of
academic library websites in the two
periods, based on [25] Qutab and
Mahmood's (2009) website content
analysis and modified for the purpose of
the current study. The modified
checklist includes 42 items divided into
eight categories:
-site description
-currency
-website aids and tools
-library general information
-library resources services
-links to e-resources
-value added services.”
“The final percentage of agreement for
all coding decisions was 89 per cent,
which suggests that the coding
classification used was reliable”
CONTENT ANALYSIS OF
THE INTERVIEW
TRANSCRIPT (4)
• Interviews: recorded and transcribed
• Team of 4 coders (2 groups) will work on an assigned
number of interviews
• Print-outs of the interview text need to be read and
the concepts highlighted
• Each group needs to meet and agree on the
highlighted concepts reporting percentage of
agreement
• All four coders will meet and discuss all concepts
and group them into larger categories
ADVANTAGES AND
DRAWBACKS
• Operates directly on text/transcripts of
communication
• Can use both qualitative and quantitative operations
• Allows research on historical documents
• Is an unobtrusive, nonreactive research technique
• Not geographically limited
• Time-consuming
• Reveals the content but not the content's significance
• Cannot draw conclusions about motives, meanings, or
effects of the messages
• Some texts (websites) have tight data collection periods
QUESTIONS?
SPECIAL THANKS TO
MORGAN GELBER, MY
TALENTED AND GIFTED
DAUGHTER WHO
NEVER GETS THE
CREDIT SHE
DESERVES.