LIS 397.1
Introduction to Research in
Library and Information
Science
Some Other Research Techniques
Relevant to Library and Information Science
R. E. Wyllys
Copyright 2003 by R. E. Wyllys
Last revised 2003 Apr 20
School of Information - The University of Texas at Austin
LIS 397.1, Introduction to Research in Library and Information Science
Lesson Objectives
• To provide you with a brief introduction to
certain research areas and techniques that
are relevant in library and information science
– Analytical Bibliography
– Bibliometrics and Cybermetrics
– Content Analysis
– Historical Research
– Stylostatistics
Analytical Bibliography
• Analytical bibliography is the study of physical
characteristics of books, manuscripts, maps, and
other written materials with the goal of shedding light
on such matters as the authenticity of individual items
and the chronology of different versions of particular
works.
– Though the techniques of analytical bibliography were
largely developed in order to study books and manuscripts of
considerable age (e.g., early printings of Shakespeare's
plays), the techniques are applicable to materials of more
recent origin.
• For example, they have been used to study alleged, and real,
forgeries in the 20th century, such as the "Vinland Map"
purchased by Yale University Library in 1958 for $1 million, and
the "Hitler Diaries" published in 1983.
– An excellent brief survey of some of the techniques is
provided by the Smithsonian Center for Materials Research
and Education as Identifying the Real Thing.
Analytical Bibliography
• Physical characteristics that can be
studied include:
– Paper and ink chemistry
– Watermarks
– Collation
– Binding
– Typefaces, even down to the level of
individual distinctive pieces of type
– Spelling variations
Bibliometrics and Cybermetrics
• So good an overview of bibliometrics and cybermetrics
has been provided by Dr. Ruth A. Palmquist (currently
Visiting Associate Professor, Graduate School of Library
and Information Science, Dominican University) that I can
do no better than simply to quote her "Bibliometrics"
Webpage* extensively in this and the following six slides
treating this topic.
• First, a definition: "Bibliometrics is a type of research
method used in library and information science. It utilizes
quantitative analysis and statistics to describe patterns of
publication within a given field or body of literature.
Researchers may use bibliometric methods of evaluation
to determine the influence of a single writer, for example,
or to describe the relationship between two or more
writers or works."
* See the slide on "References: Bibliometrics and Cybermetrics" at the end of this presentation.
Bibliometrics and Cybermetrics
• "Laws of Bibliometrics. One of the main areas in bibliometric
research concerns the application of bibliometric laws. The three
most commonly used laws in bibliometrics are: Lotka's law of
scientific productivity, Bradford's law of scatter, and Zipf's law of
word occurrence."
– Lotka's Law. Named after Alfred J. Lotka, this law "describes the
frequency of publication by authors in a given field." It states that in
a given field the number of authors who make n contributions to the
field is approximately 1/n² of the number who make a single
contribution, and that the typical proportion of those making just
one contribution is about 60% of the authors in the field. "This
means that out of all the authors in a given field, 60 percent will
have just one publication . . . 15 percent will have two publications
(1/2² times .60), 7 percent of authors will have three publications
(1/3² times .60), and so on." It can be shown that only about 6% "of
the authors in a field will produce more than 10 articles" apiece.
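The arithmetic quoted above can be checked with a minimal Python sketch. It assumes the 60% single-publication share stated on this slide; the function names and the truncation point of the sum are illustrative choices, not part of Lotka's formulation.

```python
def lotka_share(n, single_share=0.60):
    """Approximate share of authors with exactly n publications (Lotka's Law)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return single_share / n ** 2


def share_with_more_than(k, single_share=0.60, max_n=100_000):
    """Approximate share of authors with more than k publications.

    The infinite sum is truncated at max_n (an assumption of this sketch);
    the omitted tail is negligible at this size.
    """
    return sum(single_share / n ** 2 for n in range(k + 1, max_n + 1))


# Shares for one, two, and three publications, then the "more than 10" tail.
for n in (1, 2, 3):
    print(n, round(lotka_share(n), 3))
print("more than 10:", round(share_with_more_than(10), 3))
```

The tail sum comes out a little under 0.06, which is consistent with the "about 6%" figure on the slide.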
Bibliometrics and Cybermetrics
– Bradford's Law. Named after Samuel C. Bradford, a British
librarian, "Bradford's Law serves as a general guideline to librarians
in determining the number of core journals in any given field. It
states that journals in a single field can be divided into three parts,
each containing the same number of articles: 1) a core of journals
on the subject, relatively few in number, that produces
approximately one-third of all the articles, 2) a second zone,
containing the same number of articles as the first, but a greater
number of journals, and 3) a third zone, containing the same
number of articles as the second, but a still greater number of
journals. The mathematical relationship of the number of journals in
the core to the first zone is a constant n and to the second zone the
relationship is n². Bradford expressed this relationship as 1:n:n².
Bradford formulated his law [in 1934] after studying a bibliography
of geophysics. . . . Bradford's Law is not statistically accurate,
strictly speaking. But it is still commonly used as a general rule of
thumb."
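As a rough illustration of the 1:n:n² relationship, the following Python sketch ranks journals by article count and cuts the ranked list into three zones with roughly equal article totals. The article counts are invented for the example, and the function names are hypothetical; real data will fit the ratio only approximately, as the quotation notes.

```python
def bradford_zones(core_journals, multiplier):
    """Journal counts for three zones in the ratio 1 : n : n^2."""
    return [core_journals * multiplier ** k for k in range(3)]


def split_into_zones(article_counts, zones=3):
    """Split journals, ranked by productivity, into zones of roughly equal article totals."""
    ranked = sorted(article_counts, reverse=True)
    target = sum(ranked) / zones
    result, current, running = [], [], 0
    for count in ranked:
        current.append(count)
        running += count
        # Close a zone once the running total reaches the next multiple of the target.
        if len(result) < zones - 1 and running >= target * (len(result) + 1):
            result.append(current)
            current = []
    result.append(current)
    return result


# Invented data: 2 journals with 36 articles, 6 with 12, and 18 with 4.
counts = [36] * 2 + [12] * 6 + [4] * 18
zone_sizes = [len(zone) for zone in split_into_zones(counts)]
print(zone_sizes)            # journals per zone
print(bradford_zones(2, 3))  # the ideal 1 : n : n^2 pattern for n = 3
```

With this constructed data each zone holds 72 articles, and the journal counts per zone follow the ideal pattern exactly; that neatness is a property of the invented numbers, not of real literatures.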
Bibliometrics and Cybermetrics
– Zipf's Law. Named after George K. Zipf, a Harvard professor of
philology, this law describes the distribution of frequencies of words
in ordinary prose:
• Suppose that you have a reasonably lengthy text, that you count the
frequencies of the distinct words in the text, and that you then arrange
the distinct words in decreasing order of frequency. Next, you assign
rank 1 to the first word in the resulting list, i.e., the most frequent word;
rank 2, to the next most frequent word; rank 3, to the third most
frequent word; and so on.*
• Zipf's Law says that the product of the rank of a word in this list
multiplied by its frequency will be approximately constant. That is, r × f
= C, where r is the rank of a word, f is the frequency of the word, and C
is a constant. (C will depend mainly on the size of the particular text
you have counted, but certain other characteristics of the text also help
to determine C.)
– "Zipf's Law . . . is not statistically perfect, but it is very useful for
indexers."
*Note: A Web-based program for counting and ranking the frequencies of the words in a text is available
as the Web Frequency Indexer, created and maintained by Dr. Catherine N. Ball, Department of
Linguistics, Georgetown University.
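The counting and ranking procedure described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Web Frequency Indexer: the tokenizing rule (lowercase runs of letters and apostrophes) and the alphabetical ordering of tied words are assumptions made for the example.

```python
import re
from collections import Counter


def rank_frequency_table(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Ties in frequency are broken alphabetically so the ranking is reproducible.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]


sample = "the cat and the dog and the bird"
for row in rank_frequency_table(sample):
    print(row)
```

A sample this short shows only the mechanics. The approximate constancy of r × f becomes visible only in a reasonably lengthy text, as the slide specifies.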
Bibliometrics and Cybermetrics
• Another important area of bibliometrics, Citation Analysis, "uses
various methods . . . in order to establish relationships between
authors or their work. Here is a definition of citation analysis,
and definitions of co-citation coupling and bibliographic coupling,
which are specific kinds of citation analysis."
– Citation Analysis. "When one author cites another author, a
relationship is established. Citation analysis uses citations in
scholarly works to establish links. Many different links can be
ascertained, such as links between authors, between scholarly
works, between journals, between fields, or even between
countries. Citations both from and to a certain document may be
studied. One very common use of citation analysis is to determine
the impact of a single author on a given field by counting the
number of times the author has been cited by others. One possible
drawback of this approach is that authors may be citing the single
author in a negative context (saying that the author doesn't know
what s/he's talking about, for instance)."
Bibliometrics and Cybermetrics
– Co-Citation Coupling "is a method used to establish a subject
similarity between two documents. If papers A and B are both cited
by paper C, they may be said to be related to one another, even
though they don't directly cite each other. If papers A and B are
both cited by many other papers, they have a stronger relationship.
The more papers they are cited by, the stronger their relationship
is."
– Bibliographic Coupling "operates on a similar principle, but in a
way it is the mirror image of co-citation coupling. Bibliographic
coupling links two papers that cite the same articles, so that if
papers A and B both cite paper C, they may be said to be related,
even though they don't directly cite each other. The more papers
they both cite, the stronger their relationship is."
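The two kinds of coupling can be contrasted with a small Python sketch. It assumes the citation data is available as a mapping from each citing paper to the papers it cites; the paper labels and function names are hypothetical.

```python
from itertools import combinations


def co_citation_counts(citations):
    """For each pair of cited papers, count how many papers cite both."""
    counts = {}
    for cited in citations.values():
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def bibliographic_coupling_counts(citations):
    """For each pair of citing papers, count how many references they share."""
    counts = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(set(citations[a]) & set(citations[b]))
        if shared:
            counts[(a, b)] = shared
    return counts


# Hypothetical data: papers C and D each cite A and B; paper E cites only A.
citations = {"C": ["A", "B"], "D": ["A", "B"], "E": ["A"]}
print(co_citation_counts(citations))
print(bibliographic_coupling_counts(citations))
```

In this example A and B are co-cited by two papers, while C and D are bibliographically coupled through two shared references. The same input yields both measures, which is the sense in which one is the mirror image of the other.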
Bibliometrics and Cybermetrics
• The best-known facilitator of citation analysis is the
Institute for Scientific Information (ISI), which publishes several
citation indexes to journals in various fields.
– ISI’s Web of Science includes
• Science Citation Index Expanded
• Social Sciences Citation Index
• Arts & Humanities Citation Index
– For I-School students, access to ISI journals is available through
UT-Austin Library Online (UTLOL), via “Databases and Indexes to
Articles”.
• Web Applications of Bibliometrics
– Cybermetrics. "Recently, a new growth area in bibliometrics has
been in the emerging field of webmetrics, or cybermetrics as it is
often called. Webmetrics can be defined as using of bibliometric
techniques in order to study the relationship of different sites on the
World Wide Web. Such techniques may also be used to map out
(called "scientific mapping" in traditional bibliometric research)
areas of the Web that appear to be most useful or influential, based
on the number of times they are hyperlinked to other Web sites."
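One simple webmetric measure is a count of incoming links per site. The Python sketch below reads the quoted phrase as referring to links received from other sites, which is an interpretation; the site labels and the function name are hypothetical.

```python
from collections import Counter


def inlink_counts(links):
    """Count, for each site, the distinct other sites that link to it."""
    # Drop self-links and repeated links between the same pair of sites.
    distinct = {(source, target) for source, target in links if source != target}
    return Counter(target for _, target in distinct)


# Hypothetical (source site, target site) pairs.
links = [("a", "b"), ("c", "b"), ("a", "b"), ("b", "b"), ("b", "a")]
print(inlink_counts(links).most_common())
```

Counting each linking site once, rather than every individual hyperlink, is a design choice of this sketch; it keeps one heavily cross-linked pair of sites from dominating the totals.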
Content Analysis
• According to Bernard Berelson (1952), a pioneer in the field,
"Content analysis is a research technique for the objective,
systematic, and quantitative description of the manifest content
of communication." Berelson points out that content analysis
– Concerns the syntactic and semantic dimensions of language, as
applied to its pragmatic effects (e.g., its effects on the recipients of
communications).
– Must be objective. That is, a content analyst must define his or her
terms and methods sufficiently clearly so that other analysts, using
the first analyst's techniques on the same body of communication,
will reach essentially the same conclusions.
– Must be systematic. That is, all of the relevant content is to be
analyzed; a content analyst is not free to disregard those portions
of the communication under study that might tend to disprove the
point that the analyst is trying to make.
Content Analysis
• An example of content analysis is contained in a popular article
by Joyce Brothers (1972), a well known psychologist. In
discussing President Nixon's press conferences, Brothers says,
– "I think a part of the answer [to why Nixon conducted press
conferences in the way he did] can be found in the studies of
psychological researchers Richard E. Donley and David G. Winter
of Wesleyan University. They analyzed the inaugural addresses of
12 Presidents, from Theodore Roosevelt to Richard Nixon, using
words and verbal images to measure the need for power against
the need for achievement in each man.
– "Desire for power was indicated by speech references to strong
action, aggression, persuasion and argument. Need for
achievement was evidenced by such words as good, better,
excellent, high quality, etc. Most Presidents, like most journalists,
showed higher need for power than for achievement. Theodore
Roosevelt and John F. Kennedy had [the] highest power drives,
with 80 power images per 1000 spoken words.
Content Analysis
– "They were also two of the most highly successful Presidents when
it came to dealing with the press. . . .
– "Donley and Winter found only three Presidents in whom the need
for achievement was greater than the power drive. These were
Herbert Hoover, Lyndon B. Johnson, and Richard Nixon. All of
them have shunned formal press conferences. Since Mr. Nixon, at
least, has shown himself capable of handling such encounters well,
ineptness can't be the whole reason. It is more likely that
achievement-oriented Presidents find power-oriented press
conferences more frustrating than helpful in their efforts."
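The "images per 1000 spoken words" measure in the quotation can be approximated by a rate calculation, sketched here in Python. This is a crude stand-in: the study as described scored images, whereas the sketch only matches single words. The achievement words come from the quotation; the power words, the sample sentence, and the function name are assumptions made for illustration.

```python
import re


def images_per_thousand(text, category_terms):
    """Occurrences of a category's terms per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in category_terms)
    return 1000 * hits / len(words)


# Achievement terms are taken from the quotation; power terms are hypothetical.
achievement_terms = {"good", "better", "excellent"}
power_terms = {"fight", "argue", "persuade"}

speech = "we must fight and argue to persuade for better days"
print("power:", images_per_thousand(speech, power_terms))
print("achievement:", images_per_thousand(speech, achievement_terms))
```

Fixing the term lists before the text is examined is what makes such a count objective in Berelson's sense: another analyst applying the same lists to the same speech would obtain the same rates.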
Content Analysis
• Another example of content analysis is provided by Susan
Brandehoff (1987), who summarizes research by Leslie
Edmonds by saying:
– "In 'The Treatment of Race in Picture Books for Young Children'
(Book Research Quarterly, Fall, 1986, p. 30-41), Edmonds
compares two samples of picture books published by mainstream
publishers between 1928 and 1974 and between 1980 and 1984,
taking into account the race of major characters and the positive or
negative treatment of various racial groups. . . .
– "In the 1928-1974 grouping, 57% of the books featured major
characters who were white; 27% presented a racial mix of main
characters; 7% were black; 5%, Asian; 2%, Native American; and
2%, Hispanic. Positive traits such as kindness outnumbered
negative traits such as meanness by about five to one for all
groups, Edmonds says, with Native Americans slightly above norm
for positive traits, and blacks and Asian groups somewhat below
norm.
Content Analysis
– "White characters were portrayed as slightly smarter than the norm;
blacks as more musical; and Hispanics as more religious. Native
Americans were strong and brave 'in undue proportion to the total
sample', Edmonds observes. . . .
– "Asian groups probably fared worst among all minorities, Edmonds
says, with little distinction made between Chinese and Japanese
characters, inaccurate presentation of Asian culture, and no strong
Asian personality traits offered. Overall, minorities were not
presented with the same 'variety, humor, dignity, and skill depicted
in the white racial majority.'
– "In the later sample (1980-84), Edmonds found that books about
Native Americans were being published at about the same level,
but there were fewer books being published about other groups.
Blacks are still presented most frequently, but with more variety and
less stereotyping. There are not yet strong images of Asian
characters or cultures other than Chinese and Japanese, Edmonds
says, and Hispanics 'continue to get very meager coverage'."
Historical Research
• Historical studies are an important part of research in library and
information science. However, the techniques of historical
research, as opposed to the other techniques discussed in this
presentation, are usually familiar, at least in general terms, to all
educated people. Furthermore, the areas of applicability of
historical research are clearly enormous.
• Hence, it seems appropriate here simply to mention certain
guiding principles of historical research:
– Reliance on original sources to the maximum extent possible
– Recognition of the contemporary contexts of those original sources
– Seeking sources on both, or all, sides in conflicts and
disagreements so as to bring out all the relevant viewpoints
– Avoidance of anachronism, i.e., avoidance of attributing to people
in the past the kind of perspectives and attitudes that we have
today. People in the past should be judged in terms of what they
knew at the time they acted.
Stylostatistics
• Stylostatistics is the study of the statistics of written
works, usually for the purpose of identifying the true
author of the work.
• Probably the best known example of stylostatistical
analysis is Inference and Disputed Authorship: The
Federalist, by Frederick Mosteller and David L.
Wallace.
– In this book the authors, both professors of statistics
(Mosteller at Harvard University, Wallace at the University of
Chicago), examined the 12 "Federalist Papers" whose authorship
had been in dispute among historians.
• The "Federalist Papers" are short essays, published in 1787-88,
favoring the adoption of the (then) proposed Constitution of
the United States. Though the papers were published
pseudonymously under the name "Publius", Alexander
Hamilton, John Jay, and James Madison later acknowledged
their authorship of some 73 of the 85 papers. However, the
authorship of the other 12 papers remained in doubt.
Stylostatistics
– Using several stylostatistical techniques and
employing inferential statistical analysis, Mosteller
and Wallace were able to conclude, "In summary,
we can say with better foundation than ever before
that Madison was the author of the 12 disputed
papers." Historians have generally accepted the
Mosteller and Wallace findings as conclusive.
– The Mosteller and Wallace book provides a brief
history of the Federalist Papers plus a masterful
explanation of stylostatistical techniques and their
application to the problem of uncovering the
authorship of the disputed papers.
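One basic stylostatistical measurement is the rate at which an author uses particular marker words. The Python sketch below compares a disputed text with samples of known authorship by those rates. It is a deliberately simplified stand-in, not the inferential procedure of Mosteller and Wallace; the marker words, sample texts, distance measure, and function names are all illustrative assumptions.

```python
import re


def marker_rates(text, marker_words):
    """Rate per 1,000 words of each marker word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {m: (1000 * words.count(m) / total if total else 0.0)
            for m in marker_words}


def closest_author(disputed, known_samples, marker_words):
    """Name the known author whose marker-word rates are nearest the disputed text's."""
    disputed_rates = marker_rates(disputed, marker_words)

    def distance(sample):
        rates = marker_rates(sample, marker_words)
        return sum(abs(rates[m] - disputed_rates[m]) for m in marker_words)

    return min(known_samples, key=lambda author: distance(known_samples[author]))


# Invented miniature samples; real studies use long texts and many marker words.
known = {
    "Author X": "upon the whole it depends upon the case",
    "Author Y": "whilst the case stands the rule holds whilst needed",
}
disputed = "whilst the matter is open the rule applies"
print(closest_author(disputed, known, ["upon", "whilst"]))
```

A nearest-rate comparison gives no measure of how strong the evidence is. Supplying that measure is the role of the inferential statistical analysis mentioned on this slide.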
Research Can Be Fun!
References
• Analytical Bibliography
– Altick, Richard D. The Art of Literary Research. 4th ed. New
York, NY: W.W. Norton & Company; 1993. ISBN: 0-393-96240-7.
[This and the following book are exceptionally
readable. In these two books, Altick turns the story of literary
research, which relies heavily on the techniques of analytical
bibliography, into literary detective stories.]
– Altick, Richard D. The Scholar Adventurers. Columbus, OH:
Ohio State Univ Press; 1987. ISBN: 0-814-20435-X.
– Harris, Robert. Selling Hitler: The Extraordinary Story of the
Con Job of the Century--The Faking of the Hitler "Diaries".
New York, NY: Pantheon; 1986. ISBN: 0-394-55336-5.
– Tanselle, G. Thomas. Literature and Artifacts. Charlottesville,
VA: Bibliographical Society of the University of Virginia; 1998.
ISBN: 1-88-363-106-8.
References
• Bibliometrics and Cybermetrics
– Bradford, S. C. Documentation. Washington, DC; Public
Affairs Press; 1950.
– Palmquist, Ruth A. Bibliometrics. Retrieved 2003 April 12 from
http://www.gslis.utexas.edu/~palmquis/courses/biblio.html
– Rousseau, Ronald. Time table of bibliometrics. Retrieved 2003
April 12 from
http://users.pandora.be/ronald.rousseau/html/time_table_of_bibliometrics.html
[This time table provides an interesting overview of bibliometrics.]
– Zipf, George K. Human Behavior and the Principle of Least Effort:
An Introduction to Human Ecology. Cambridge, MA:
Addison-Wesley; 1949.
References
• Content Analysis
– Berelson, Bernard. Content Analysis in Communication
Research. Glencoe, IL: Free Press; 1952.
– Brandehoff, Susan. Picturebooks get Low Grades for
Ethnic/Racial Imagery. American Libraries. April 1987. P.
298.
– Brothers, Joyce. The President and the Press. TV Guide.
September 23, 1972. Pp. 6-12.
– Holsti, Ole R. Content Analysis for the Social Sciences and
Humanities. Reading, MA: Addison-Wesley; 1969.
References
• Historical Research
– Gray, Wood. Historian's Handbook: A Key to the Study and
Writing of History. Boston, MA: Houghton Mifflin; 1959.
– Nevins, Allan. The Gateway to History. Garden City, NY:
Anchor; 1962.
– Tey, Josephine. The Daughter of Time. New York, NY:
Macmillan; 1952. [Written by a master detective story author,
this fascinating novel describes the pursuit, by a bedridden
researcher, of an unbiased, objective understanding of King
Richard III despite the obscuring fog of writings by Richard's
victorious opponents and their successors, including
Shakespeare. The result is a delightful story that is also a
primer on historical research.]
References
• Stylostatistics
– Morton, Andrew Q.; McLeman, James. Christianity in the Computer
Age. New York, NY: Harper and Row; 1965. [An early stylostatistical
publication, this book examines the "Pauline Epistles", the 13 books of
the New Testament that have traditionally been attributed to St. Paul.
The authors conclude that only 4 of the 13 epistles were, beyond a
reasonable doubt, written by Paul.]
– Morton, Andrew Q. Literary Detection : How to Prove Authorship and
Fraud in Literature and Documents. New York, NY: Scribner; 1978.
ISBN: 0-684-15516-8.
– Mosteller, Frederick; Wallace, David L. Inference and Disputed
Authorship: The Federalist. Reading, MA: Addison-Wesley; 1964.
– Yule, G. Udny. The Statistical Study of Literary Vocabulary.
Cambridge, UK: Cambridge University Press; 1944. [The pioneering
work on stylostatistics.]