Boolean, bibliometrics,
and beyond
Part 2
LIS 670
donna Bair-Mundy
Bibliometrics
Bibliometrics – a definition
Using quantitative analysis and
statistics to examine patterns in
academic publishing, now
including information transmitted
via the World Wide Web
Bibliometrics – what it looks at
• Author productivity
• Citation analysis – impact factors,
indexing
• Obsolescence of information
resources – half-life of articles
• Dispersion of articles in certain fields
• Word frequencies
Bibliometrics – Purposes (1)
Provide evolutionary models
of science, technology, and
scholarship
Invisible colleges
Physics
Astrophysics
Biophysics
Subatomic particle physics
Structure of scholarly disciplines
Evolution of a discipline over
time
Evolution of concepts
Global warming
Bibliometrics – Purposes (2)
Assist development of
information retrieval
methodologies
Provide tools for studying
information use and impact
Assist in selection and
deselection of resources
Properties of scientific
literature
Fragmentary - each paper
contributes a small piece to the
puzzle under study
Derivative - scientific papers rely
heavily on previous research
(acknowledged in citations)
Edited - peer reviewed by
anonymous referees
Evolution of a discipline
Cole and Eales - 1917 - The history of
comparative anatomy—a statistical analysis of
the literature
• Purpose: "to reduce to geometric form
the activities of the corporate body of
anatomical research, and the relative
importances from time to time of each
country and division of the subject"
• Looked at 6,436 publications dealing
with animal anatomy for the period
1543 to 1860
Published in: Sci. Progr. 11:578-596.
Evolution of a discipline
Cole and Eales - 1917 - The history of comparative
anatomy—a statistical analysis of the literature
• When were the periods of greater or
less importance;
• Where were the centers of activity at
any given time?
• As the field grew, how and when did it
begin to be subdivided into narrower
fields?
Looking at publications
within a field to tell us
about the field itself
Evolution of a discipline: IS
Harmon, Glynn - 1971 – On the evolution of
information science. JASIS 22(4):235-241
• Emergence and development of
information science
• Relationships and roles of
information science within
potentially emergent suprasystem
of knowledge
Science, politics, and economics
E. Wyndham Hulme 1923 - Statistical
bibliography in relation to the growth of
modern civilization
First to use the term "statistical
bibliography"
Purpose: "to ascertain and illustrate
by bibliographical data, various
stages in the development of the
mechanics of civilization"
Published by Butler and Tanner Grafton (London)
Hulme (cont’d)
Used 13 annual issues of The
International Catalogue of Scientific
Literature, from 1901 to 1913
Counted author entries for various
subjects
Tabulated number of indexed journals
by countries (which countries are
highly productive in science?)
Hulme (cont’d)
Felt that subject division in a
discipline was a sign of growth
Concluded that scientific publication
output is influenced by population
change and political and economic
movements
Research output by countries
J. Martin van Zyl 2013 – The generalized Pareto
distribution fitted to research outputs of countries
Scientometrics 94(3):1099-1109
Which continent (besides Antarctica) is not represented?
Why might that be?
What might be the consequences?
Cost of research
Consequences
ebola: 722 results
ebolavirus: 984 results
aids: 122,722 results
hiv: 196,414 results
Author productivity
Alfred J. Lotka 1926 - Statistics—the
frequency distribution of scientific
productivity
Purpose: to "determine, if possible,
the part which men of different calibre
contribute to the progress of science"
Looked at Chemical Abstracts Index,
then Geschichtstafeln der Physik
Published in: J. Washington Acad. Sci. 16:317-325.
Lotka's Law
The total number of authors y in
a given subject, each producing
x publications, is inversely
proportional to x raised to
some power n.
Lotka's Law - scientific publications
Inverse square law of scientific productivity
Where:
x = number of publications
y = number of authors credited with x
publications
n = constant (equals 2 for scientific
subjects)
C = constant
x^n • y = C
Lotka's Law - scientific publications
[Chart: no. of authors with 1 publ., 2 publ., 3 publ., 4 publ.; x^n • y = C]
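A minimal sketch of the inverse square law, assuming n = 2 as the slide gives for scientific subjects. The function name and the figure of 100 single-publication authors are invented for illustration. Setting x = 1 in x^n • y = C shows that C is simply the number of authors with one publication.

```python
def lotka_expected_authors(single_paper_authors, max_publications, n=2):
    """Return {x: expected number of authors with x publications}.

    From x^n * y = C: when x = 1, y = C, so C equals the number of
    authors credited with exactly one publication.
    """
    c = single_paper_authors
    return {x: c / x**n for x in range(1, max_publications + 1)}


# Hypothetical field with 100 authors who each published once.
expected = lotka_expected_authors(single_paper_authors=100, max_publications=4)
for x, y in expected.items():
    print(f"{x} publ.: about {y:.2f} authors")
```

With n = 2, authors with two publications are expected to be a quarter as numerous as authors with one, and authors with three publications about a ninth as numerous.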
Relative impacts of journals
Gross & Gross - 1927 - College libraries and
chemical education
Purpose: Select appropriate journals
for a chemical library to provide good
education for students
Which journals to collect?
Tabulated 3,633 citations found in the
1926 volume of the Journal of the
American Chemical Society
First use of citation analysis rather
than publication counts
Published in: Science 66:385-389
Relative impacts of journals
Journal Citation Reports
“JCR is still the only usable tool to rank
thousands of scholarly and
professional journals...”
PETER JACSO
Citation Indexing
Eugene Garfield 1955 - Citation indexes for
science: a new dimension in documentation
through association of ideas
Impact factor
Influence of an article based on
citations to it
Science Citation Index
Published in: Science 122:108-111.
Problems of indexing
The interrelationship between the chemistry and the biological organisms of the soils of Cambodia. (1955)
The soil ecology of Kampuchea (1995)
Citation matrix
[Diagram: citing articles and the articles they cite]
ISI Web of Science (1)
ISI Web of Science (2)
ISI Web of Science (3)
ISI Web of Science (4)
ISI Web of Science (5)
Science Citation Index
Association-of-ideas index
[Diagram: citing articles and the articles they cite]
http://libweb.hawaii.edu/uhmlib/databases/er_title.html#WEB
Co-citation analysis
Articles that cite the same article are likely to both
be of interest to the reader of the cited article
[Diagram: two citing articles and the article they both cite]
These two articles are likely to be related
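The relation on this slide can be sketched in a few lines, assuming citation data is available as a mapping from each citing article to its reference list. The article names and the function name are made up for illustration. Pairing articles by the references they share is also known in the literature as bibliographic coupling.

```python
from collections import defaultdict
from itertools import combinations


def shared_citation_pairs(references):
    """Map each pair of citing articles to the set of articles both cite."""
    # Invert the reference lists: cited article -> articles that cite it.
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)

    # Any two articles citing the same article form a related pair.
    pairs = defaultdict(set)
    for cited, citers in cited_by.items():
        for a, b in combinations(sorted(citers), 2):
            pairs[(a, b)].add(cited)
    return dict(pairs)


# Hypothetical reference lists.
references = {
    "Article A": ["Bradford 1934", "Lotka 1926"],
    "Article B": ["Bradford 1934"],
    "Article C": ["Zipf 1935"],
}
related = shared_citation_pairs(references)
print(related)
```

Here Article A and Article B are flagged as related because both cite Bradford 1934. Article C shares no reference with the others and is not paired.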
Selecting productive journals
Samuel Clement Bradford 1934 - Sources
of information on specific subjects
Purpose: to develop a means by
which librarians could select the
most usable periodicals
First paper published on
observations of scattering
Bradford's Law
Published in: Engineering 137:85-86
Bradford's Law of Scattering (1)
"If scientific journals are arranged in
order of decreasing productivity of
articles on a given subject, they may be
divided into a nucleus of periodicals
more particularly devoted to the subject
and several groups or zones containing
the same number of articles as the
nucleus, when the numbers of
periodicals in the nucleus and
succeeding zones will be as a : n : n^2 : n^3 …"
Bradford's Law of Scattering (2)
Zone         No. of source   No. of articles   Total no. of
             journals        per source        articles
3 sources     1              60                 60
              2              35                 70
                                               130
9 sources     1              30                 30
              2              25                 50
              2               9                 18
              4               8                 32
                                               130
27 sources   10               6                 60
              7               5                 35
              5               4                 20
              5               3                 15
                                               130
Bradford's Law of Scattering (3)
3 sources
130 articles
9 sources
130 articles
27 sources
130 articles
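A rough sketch of how the zones can be derived, using the journal counts from the Bradford's Law of Scattering (2) slide. The function name and the greedy cut-off rule are assumptions made for this example, not Bradford's own procedure: journals are ranked by productivity and a zone is closed once it holds its share of the articles.

```python
def bradford_zones(articles_per_journal, zone_count):
    """Split journals, ranked by productivity, into zones of equal article totals."""
    ranked = sorted(articles_per_journal, reverse=True)
    target = sum(ranked) / zone_count  # articles each zone should hold
    zones, current, running = [], [], 0
    for count in ranked:
        current.append(count)
        running += count
        # Close the zone once it reaches its share; the last zone takes the rest.
        if running >= target and len(zones) < zone_count - 1:
            zones.append(current)
            current, running = [], 0
    zones.append(current)
    return zones


# (number of journals, articles per journal) from the slide's table.
table = [(1, 60), (2, 35), (1, 30), (2, 25), (2, 9), (4, 8),
         (10, 6), (7, 5), (5, 4), (5, 3)]
journals = [articles for n, articles in table for _ in range(n)]

zones = bradford_zones(journals, zone_count=3)
sizes = [len(zone) for zone in zones]
print(sizes, [sum(zone) for zone in zones])
```

For this data the zones hold 3, 9, and 27 journals with 130 articles each, so each zone needs three times as many journals as the one before it. Real data rarely divides this cleanly.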
George Kingsley Zipf 1935
The psycho-biology of language: an
introduction to dynamic philology
Frequency distributions of words
Two laws
Less frequently occurring
words
Frequently occurring words
Published by MIT Press
Zipf's Law of High
Frequency Words
Proposed in 1949 by George Kingsley Zipf
Where:
r = rank (in terms of frequency)
f = frequency (no. of times the given word
is used in the text)
c = constant for the given text
r • f = c
For a given text the rank of a word multiplied
by the frequency is a constant.
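A small sketch for checking r • f = c against a text. The function name and the sample sentence are invented for illustration, and two simplifications are assumed: words are split on whitespace, and tied words receive consecutive ranks in alphabetical order.

```python
from collections import Counter


def rank_frequency_table(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    counts = Counter(text.lower().split())
    # Sort by descending frequency, then alphabetically to break ties.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ordered, start=1)]


sample = "the cat and the dog and the bird"
rows = rank_frequency_table(sample)
for rank, word, freq, product in rows:
    print(rank, word, freq, product)
```

A sentence this short will not show a constant product. The relationship is only expected to appear, approximately, among the high-frequency words of a long text.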
Application of Zipf's laws
William Goffman - automatic indexing
Determine transition point between
high- and low-frequency words
Collect equal number of words above
and below the transition point
Eliminate trivial words using stop list
Remaining content-bearing words
indicate document contents
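The steps above can be sketched as follows. Everything named here is an assumption for illustration: the tiny stop list, the `window` parameter, and in particular the transition-point formula T = (-1 + sqrt(1 + 8 • I1)) / 2, where I1 is the number of words used exactly once. That formula is commonly attributed to Goffman but does not appear on the slides.

```python
import math
from collections import Counter

# Tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is"}


def transition_point(freqs):
    """Estimate the frequency separating high- and low-frequency words.

    Assumed formula: T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the
    number of words that occur exactly once.
    """
    i1 = sum(1 for f in freqs.values() if f == 1)
    return (-1 + math.sqrt(1 + 8 * i1)) / 2


def index_terms(text, window=3, stop_words=STOP_WORDS):
    """Pick up to `window` words on each side of the transition point, minus stop words."""
    freqs = Counter(text.lower().split())
    t = transition_point(freqs)
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    above = [word for word, f in ranked if f >= t]
    below = [word for word, f in ranked if f < t]
    candidates = above[-window:] + below[:window]
    return [word for word in candidates if word not in stop_words]


text = "soil soil soil soil the the the ecology ecology fungi"
terms = index_terms(text)
print(terms)
```

In this toy text the stop word "the" falls inside the window and is dropped, leaving content-bearing words as candidate index terms.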
Obsolescence of resources
Charles F. Gosnell 1944 - Obsolescence of
books in college libraries
Purpose: "to discover lines of
trend or curves of distribution by
means of which this rate of
obsolescence may be expressed in
mathematical form"
Published in: College Res. Libr. 5:115-125
Curve of obsolescence
Age at time of use
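One common way to summarize such a curve is a half-life: the median age of items at the time they are used, so that half of all uses involve items that age or younger. The sketch below assumes this median definition, which is not necessarily Gosnell's own method, and the ages are made up.

```python
from statistics import median


def half_life(ages_at_use):
    """Median age (in years) of items at the time they were used or cited."""
    return median(ages_at_use)


# Hypothetical ages, in years, of eight items at the moment each was used.
ages = [1, 1, 2, 3, 4, 6, 9, 15]
result = half_life(ages)
print(result)
```

A short half-life suggests a fast-moving literature in which older items fall out of use quickly.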
Alan Pritchard 1969
Statistical bibliography or bibliometrics?
Coined the term "bibliometrics"
"the application of mathematics
and statistical methods to books
and other media of
communication"
Published in: Journal of Documentation
25(4):348-349
Google indexing criteria
Text within page being indexed to
determine topic
Links to page being indexed
Anchor text of links to page being
indexed (indication of topic)
Weight links to page being indexed
by links to the linking pages
“For a good explanation of Bradford’s
Law of Scattering see...”
Google
Treating links as citations to compute PageRank
[Diagram labels: high-weight linkage, low-weight linkage]
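A simplified sketch of the idea of weighting a link by the links pointing to the linking page. It is the textbook iterative form of PageRank, not Google's production system. The damping value of 0.85, the iteration count, and the five-page link graph are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page weights; `links` maps each page to the pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its weight on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its weight evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked from A, B, and D.
links = {"A": ["C"], "B": ["C"], "C": ["D"], "D": ["C"], "E": ["A"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Pages A and D each receive exactly one link, but D ends up weighted far higher because its link comes from the heavily linked page C, while A's comes from E, which nothing links to.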
Citation tree rings represent the citation history of an article. The color of a citation ring
denotes the time of corresponding citations. The thickness of a ring is proportional to
the number of citations in a given time slice. Chen, C. 2006. CiteSpace II: detecting and visualizing
emerging trends and transient patterns in scientific literature. Journal of the American Society for Information
Science and Technology 57(3):359-377.
Bibliometrics in Action
A time-zone view of mass-extinction research. Chen, C. 2006. CiteSpace II: detecting and
visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for
Information Science and Technology 57(3):359-377.
Adding bibliometric visualizations to digital
library search results