Trust and Epistemic Communities in Biodiversity Data Sharing

Trust and Epistemic Communities
in Biodiversity Data Sharing
Nancy Van House
SIMS, UC Berkeley
www.sims.berkeley.edu/~vanhouse
Trust and Epistemic Communities in Biodiversity Data Sharing
• Digital libraries (DLs): ready access to unpublished information by a variety of users, crossing sociotechnical boundaries
  - Raises issues of trust and credibility
• Knowledge is social
  - What we know and whom we believe are determined by/within epistemic cultures
• Biodiversity data
  - Great variety of information, sources, purposes
• CalFlora: an example of a user-oriented DL
  - Incorporating users’ practices of trust and credibility
  - Negotiating differences across epistemic cultures
DLs Facilitate Access
• To a greater variety of information:
  - Unpublished (unreviewed) information
  - “Raw” data such as reports of observations
  - Information from outside one’s own reference group
  - Problems:
    - Which info, sources do we believe?
    - How do we evaluate info from unfamiliar sources?
    - Which info do we use for what purposes?
• By people from outside one’s own reference group
  - Inappropriate use of information?
  - Burden on data owner of making data available, usable, and understandable to reduce misuse
Examples of Risks – Botanical Information
• Unreliable info
  - Erroneous, duplicative observations >> belief that a species is prevalent >> not preserving a population of a rare species
  - Chasing after an erroneously reported sighting of a rare species, or discounting a significant sighting as an amateur’s error
• Inappropriate use of info
  - Private landowners destroying specimens of a rare plant to avoid legal limits on land development
  - Collectors (over-)collecting specimens of rare species
Knowledge is Social
• What we know comes primarily from others.
  - Cognitive efficiency: we don’t have time, resources
  - Expertise: we don’t have sufficient knowledge in all areas
• We have to decide whom we trust, what we believe.
• What we consider “good” work, whom we believe, and how we decide are determined and learned in epistemic communities
• DLs need to support the diverse practices of epistemic communities
Social Nature of Knowledge is of Concern in Many Areas
• Science studies
  - Inquires into the construction of scientific knowledge & authority
• Social epistemology
  - Asks: How should the collective pursuit of knowledge be organized?
• Situated action/learning
  - Posits knowledge, action, identity, and community to be mutually constituted
• Knowledge management
  - Is concerned with how to share knowledge
Cognitive Trust and DLs
• For people to use a DL:
  - Information must be credible
  - Sources must be trustworthy
  - DL itself must be perceived to be trustworthy
• How can DLs be designed to:
  - Facilitate users’ assessments of trust and credibility of info and sources?
  - Demonstrate their own trustworthiness?
Epistemic Cultures
• “…those amalgams of arrangements and mechanisms … which, in a given field, make up how we know what we know.”
• “Epistemic cultures…create and warrant knowledge, and the premier knowledge institution throughout the world is, still, science.”
  Karen Knorr-Cetina, Epistemic Cultures
Culture
• Context of history and on-going events
• Practice: how people actually do their day-to-day work
• Artifacts
  - Info artifacts include documents, images, thesauri, classification systems
• Diversity
  - If all the same, no culture
  - Including diversity across areas of science
Epistemic Cultures Differ
• Practices of work
• Practices of trust
• Artifacts – e.g. genres
• Methods of data collection and analysis
• Meanings, interpretations, understandings
• Tacit knowledge and understandings
• Values
• Methods, standards, and information for evaluating other participants’ work and values
• Institutional arrangements
Communities and Knowledge
• Becoming a member of a community of practice = identity
  - Learning practices, values, orientation to the world
• We learn what to believe, whom to believe, how to decide in epistemic communities.
• We tend to trust people from within our own epistemic communities.
  - Similar values, orientation, practices, standards
  - Ability to assess their credibility
DLs and Epistemic Cultures
• DLs enable information to cross epistemic communities.
  - More easily, more often than before.
  - Raw data, not just syntheses, analyses – e.g. publications
• Crossing communities often undermines our practices of trust.
  - Who are these people?
  - How did they collect the data?
  - What do they know?
  - What are their goals, values, priorities?
• DLs need to be designed to support practices of assessing trustworthiness and credibility.
Biodiversity Data
• Biodiversity: studies the diversity of life and the ecosystems that maintain it
• Central question: change over space and time
• Uses large quantities of data that vary in:
  - Precision and accuracy
  - Methods of data collection, description, storage
• Old data particularly valuable
• Broad range of datasets: biological, geographical, meteorological, geological…
• Created and used by different professions, disciplines, types of institutions…for different purposes
• Politically, economically sensitive data
“Citizen Science”
• Fine-grained data from observers in the field
• Observers with varying levels and types of expertise
  - E.g., expert on an area, habitat, taxon…
  - Expert amateurs
• Private-public cooperation
  - Government agencies, environmental action groups, university herbaria, membership organizations, concerned individuals…
CalFlora
http://www.calflora.org
• Comprehensive web-accessible database of plant distribution information for California
• Independent non-profit organization
• Designed/managed by people from the botanical community, not librarians or technologists
• Free
• In conjunction with UC Berkeley Digital Library (http://elib.cs.berkeley.edu)
CalFlora Target Users
• Researchers & professionals in land management
  - Ready access to data for
    - Addressing critical issues in plant biodiversity
    - Analyzing consequences of land use alternatives and environmental change on distribution of native and exotic species
• The public: promoting interest in biodiversity
  - Active engagement in biodiversity issues/work
  - Wildflowers as “charismatic”
CalFlora Priorities
• Focus on people; put technology in the back seat
• Pay attention to how the world works for the people who produce and use information
• Honor existing traditions of data exchange
Botanists at Work
Components of Interest Today
• CalPhotos
• CalFlora Occurrence Database
CalPhotos
• In conjunction with the UC Berkeley Digital Library Project http://elib.cs.berkeley.edu
• > 28,000 images of California plants
  - Approx. half of all Calif. species are represented
• Sources
  - Some institutions – e.g. Cal Academy of Sciences
  - Many from “native plant enthusiasts”
  - Currently accepting/soliciting contributions from users
• Major reported uses
  - Plant identification
  - Illustrations
CalFlora Occurrence Database
• > 800,000 geo-referenced reports of observations
  - Specimens in collections
  - Reports from literature
  - Reports from field
  - Checklists
• Sources
  - 19 institutions
  - About to begin accepting reports from registered contributors via Internet
CalFlora Occurrence Database
• Users can
  - “Click through the map to underlying data”
  - Download data for own analyses, tools
• Uses
  - Land management decisions
  - Legally-mandated environmental reports (NEPA, CEQA)
  - Identify plants (though not designed for this)
• Common analyses
  - Which species are present in an area
  - Which are common, which are rare
  - Which species are restricted to a habitat affected by proposed actions
  - Analyze various species in combination, by geography
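The first two common analyses can be illustrated on downloaded records. This is only a minimal sketch: the `Occurrence` fields, the bounding-box query, the `rare_max` threshold, and the sample records are all made up for illustration and are not CalFlora's data format or method. Note that report counts are not abundance; duplicative observations can make a species look more prevalent than it is.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One geo-referenced report (hypothetical, simplified fields)."""
    species: str
    lat: float
    lon: float


def species_counts(records, south, north, west, east):
    """Count reports per species inside a latitude/longitude bounding box."""
    return Counter(
        r.species
        for r in records
        if south <= r.lat <= north and west <= r.lon <= east
    )


def split_common_rare(counts, rare_max=1):
    """Split species by report count; `rare_max` is an arbitrary cut-off."""
    rare = sorted(s for s, n in counts.items() if n <= rare_max)
    common = sorted(s for s, n in counts.items() if n > rare_max)
    return common, rare


# Made-up sample records, for illustration only.
records = [
    Occurrence("Eschscholzia californica", 37.87, -122.26),
    Occurrence("Eschscholzia californica", 37.90, -122.30),
    Occurrence("Quercus agrifolia", 37.88, -122.25),
    Occurrence("Quercus agrifolia", 34.05, -118.25),  # outside the box below
]

in_area = species_counts(records, south=37.5, north=38.0, west=-122.5, east=-122.0)
common, rare = split_common_rare(in_area)
print("present:", sorted(in_area))
print("common:", common, "rare:", rare)
```

A threshold on report counts only ranks how often a species was reported in the area, so the result still needs the record-by-record judgment about sources that the rest of the talk describes.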
CalFlora Occurrence Database: Significance
• Most comprehensive source by far (for Calif.)
• Common as well as rare taxa
  - Biodiversity research is beginning to take an interest in all populations, not just rare ones -- requires vastly more data
• Data downloadable, manipulable
• Easy to use (for professionals, anyway)
• Remote access via Internet
  - E.g. botanist in remote National Forest…
• About to accept observations from “the public”
  - Source of valuable data re rare and especially common species
Dilemmas and Conflicts
• Useful place to see tensions, breakdowns, conflicts across epistemic cultures
• Not who’s right or wrong, but underlying differences in values, priorities, practices, understandings
CalFlora Dilemmas
• Quality filtering: made centrally vs. pushed down to user
  - Inclusiveness of observations vs. selectivity
  - Speed of additions vs. review, filtering
  - Labelling data for quality vs. providing info for users
• Access
  - Benefits vs. dangers of wide access to information
  - Free vs. fee
    - Cost recovery
    - Discourage frivolous use
    - Who bears the costs?
    - Externalities
Dilemmas, Cont.
• Institutional independence:
  - Autonomy, ability to be responsive to multiple stakeholder communities vs. security and credibility of institutional sponsorship
How (Some) Experts Assess Occurrence Reports
• The evidence:
  - Type of report (specimen, field observation, list)
  - Type of search (casual, directed)
• The source:
  - Personal knowledge of contributor’s expertise
  - Examination of other contributions by the same person
  - Annotations by trusted others
• Ancillary conditions:
  - Likelihood of that species appearing at that time, habitat, geographical location
How CalFlora Presents Occurrence Data
• Links to data source(s) – personal and institutional
• Compliance with institutional source’s requirements
  - Fuzzed locations
  - Links to institutional source’s caveats, explanations
• Publicly-contributed observations
  - Info about observer
  - Info about observation
  - Annotations by experts
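The slides do not say how locations are fuzzed. One common approach, shown here purely as an assumption and not as CalFlora's method, is to snap each coordinate to the centre of a coarse grid cell so that all points in the cell display identically. The function name `fuzz_location` and the default cell size of 0.1 degree are hypothetical.

```python
import math


def fuzz_location(lat, lon, cell_deg=0.1):
    """Snap a coordinate to the centre of its grid cell.

    Assumed technique for illustration: every point inside the same
    `cell_deg` x `cell_deg` cell maps to the same published location,
    so the exact site of a sensitive population is not revealed.
    """
    def snap(value):
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    return round(snap(lat), 6), round(snap(lon), 6)


# Two made-up points in the same cell publish the same fuzzed location.
print(fuzz_location(37.8716, -122.2727))
print(fuzz_location(37.8799, -122.2001))
```

A larger `cell_deg` protects sensitive sites better but makes the record less useful for fine-grained analysis, which is the same access-versus-risk trade-off raised under the dilemmas above.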
Contributor Registration
• Biography, credentials (free text)
• Expertise/interests (free text)
• Affiliation
• Contact info/web site
• “I will submit only my own observations of wild plants. I realize that this system is only for firsthand reports about plants, native and introduced, that are growing without deliberate planting or cultivation.”
• “I will…make sure I have the correct scientific name…I will submit uncertain identifications only if I believe them to be very important and time sensitive, and will label such reports ‘uncertain.’”
Contributor Registration (cont)
• Experience level (self-assessment; check one)
  - I am a professional biologist/botanist, or have professional training in botany.
  - Although I do not have formal credentials, I am recognized as a peer by professional botanists.
  - Although I do not consider myself to have professional-level knowledge, I am quite experienced in the use of keys and descriptions, and/or have expertise with the plants for which I will be submitting observations.
  - I do not have extensive experience or background in botany, but I am confident that I can accurately identify the plants for which I will be submitting observations.
Occurrence Form
• Species identification, habitat, location, date
• Method of identification
  - “I recognize …from prior determinations and experience”
  - “I compared this plant with herbarium specimens”
  - “I keyed this plant in a botanical reference”
  - “I compared … with published taxonomic descriptions”
  - “An expert reviewed and confirmed this identification”
• Certainty of identification
  - “I am confident of this identification, and submit this as a positive observation.”
  - “I am not certain of this identification but believe it to be a significant observation and submit it here as an alert only.”
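The form above could be modelled as a small record type so that the method and certainty of each identification travel with the observation. This is a hedged sketch only: the class names, field names, and the `label` format are invented for illustration and do not describe CalFlora's actual schema; the enum values paraphrase the form's choices.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class IdMethod(Enum):
    """Method of identification (paraphrasing the form's five choices)."""
    RECOGNIZED = "recognized from prior determinations and experience"
    HERBARIUM = "compared with herbarium specimens"
    KEYED = "keyed in a botanical reference"
    DESCRIPTIONS = "compared with published taxonomic descriptions"
    EXPERT_CONFIRMED = "reviewed and confirmed by an expert"


class Certainty(Enum):
    """Certainty of identification (the form's two choices)."""
    POSITIVE = "positive observation"
    ALERT = "alert only"


@dataclass(frozen=True)
class OccurrenceReport:
    """One contributed observation (hypothetical field names)."""
    observer: str
    species: str
    habitat: str
    location: str
    observed_on: date
    method: IdMethod
    certainty: Certainty

    def label(self) -> str:
        """Describe the report, flagging alert-only reports as uncertain."""
        flag = "uncertain: " if self.certainty is Certainty.ALERT else ""
        return (
            f"{flag}{self.species} at {self.location} "
            f"({self.method.value}; reported by {self.observer})"
        )


# Made-up example report.
report = OccurrenceReport(
    observer="J. Doe",
    species="Eschscholzia californica",
    habitat="grassland",
    location="Berkeley hills",
    observed_on=date(2000, 4, 15),
    method=IdMethod.KEYED,
    certainty=Certainty.ALERT,
)
print(report.label())
```

Keeping the observer, method, and certainty on every record is what lets a reader apply their own practices of trust instead of relying on a single central quality filter.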
Annotations
• Herbarium practice: experts annotate records with corrections, comments.
• CalFlora: registered experts can annotate photos and occurrence records.
  - Annotation by an expert raises the credibility of a record.
  - Actually – how often?
CalFlora Data and Trust
• Trusting data
  - Every observation trackable to source(s)
  - Detailed info & contact info for source, observer
  - Detailed info about observation
  - Observations categorized by type
  - Annotation
• Trusting users
  - NOT registering or charging users
  - Respecting source’s limits, caveats on data
  - Leaving quality decisions to the users
• Trusting CalFlora
  - Detailed list of contributing organizations, advisors
  - NOT affiliated with another organization
Concerns
• CalFlora relies on record-by-record examination
  - Looking at methods of classifying records in collections
• CalFlora relies on voluntary contributions of data
  - Experts with lots of data and no time to contribute
  - Well-meaning volunteers with time but not expertise
• Users need to be able to track back to source of each record, each data point
  - Concern about “modalities,” uncertainties being lost
• Archiving
  - Concern about dynamism of CalFlora
  - Stability of electronic media
  - Stability of the organization
• Delegating decisions about quality of observations to (inexpert) users
Implications for DLs, Other Info Systems
• The social nature of knowledge
  - We have to decide on whom we will depend
  - We learn from others whom and what we can depend on
  - Information must be credible to be used
• The importance of culture in constituting knowledge
  - Practice, values, orientations…
• Epistemic cultures differ
  - Not simply a matter of experts vs. public
Therefore:
• DLs need to accommodate practices
  - Incl. practices of trust and credibility
• Users need to know provenance of data
• Users differ
  - and not just experts vs. nonexperts
  - DLs serve multiple, varied epistemic cultures
  - Same person, multiple cultures
• Users need flexibility to accommodate the DL to their needs, practices
  - Some users need decisions made for them
• >> involvement of users in design
Implications for DL Creation and Management
• Different epistemic cultures participate in the design and management of DLs, as well
  - Librarians
  - Technologists
  - Various, differing user groups
• Differences in practices, understandings, values >> differences in priorities and decisions
• A continual process of negotiation and translation