Trust and Epistemic Communities in Biodiversity Data Sharing
Trust and Epistemic Communities
in Biodiversity Data Sharing
Nancy Van House
SIMS, UC Berkeley
www.sims.berkeley.edu/~vanhouse
Trust and Epistemic Communities
in Biodiversity Data Sharing
Digital libraries (DLs): ready access to unpublished
information by a variety of users, crossing
sociotechnical boundaries
Raises issues of trust and credibility
Knowledge is social
What we know, whom we believe is determined
by/within epistemic cultures
Biodiversity data
Great variety of information, sources, purposes
CalFlora: an example of a user-oriented DL
Incorporating users’ practices of trust and
credibility
Negotiating differences across epistemic cultures
DLs Facilitate Access
To greater variety of information:
Unpublished (unreviewed) information
“Raw” data such as reports of observations
Information from outside own reference
group
Problems:
Which info, sources do we believe?
How do we evaluate info from unfamiliar sources?
Which info do we use for what purposes?
By people from outside own reference group
Inappropriate use of information?
Burden on data owner of making data available,
usable, and understandable to reduce misuse
Examples of Risks
– Botanical Information
Unreliable Info
Erroneous, duplicative observations >> belief that
a species is prevalent >> not preserving a
population of a rare species
Chasing after erroneous reported sighting of a rare
species – or discounting significant sighting as
amateur’s error
Inappropriate Use of Info
Private landowners destroying specimens of a rare
plant to avoid legal limits on land development
Collectors (over-)collecting specimens of rare
species
Knowledge is Social
What we know comes primarily from others.
Cognitive efficiency: we don’t have time, resources
Expertise: we don’t have sufficient knowledge in all
areas
Have to decide whom we trust, what we
believe.
What we consider “good” work, whom we
believe, and how we decide are determined and
learned in epistemic communities
DLs need to support the diverse practices of
epistemic communities
Social Nature of Knowledge is of
Concern in Many Areas
Science studies
Inquires into the construction of scientific knowledge
& authority
Social epistemology
Asks: How should the collective pursuit of knowledge
be organized?
Situated action/learning
Posits knowledge, action, identity, and community to
be mutually constituted
Knowledge management
Is concerned with how to share knowledge
Cognitive Trust and DLs
For people to use a DL:
Information must be credible
Sources must be trustworthy
DL itself must be perceived to be trustworthy
How can DLs be designed to:
Facilitate users’ assessments of trust and
credibility of info and sources?
Demonstrate their own trustworthiness?
Epistemic Cultures
“…those amalgams of arrangements and
mechanisms … which, in a given field,
make up how we know what we know.”
“Epistemic cultures…create and warrant
knowledge, and the premier knowledge
institution throughout the world is, still,
science.”
Karen Knorr-Cetina, Epistemic Cultures
Culture
Context of history and on-going events
Practice: how people actually do their day-to-day work
Artifacts
Info artifacts include documents, images,
thesauri, classification systems
Diversity
If all the same, no culture
Including diversity across areas of science
Epistemic Cultures Differ
Practices of work
Practices of trust
Artifacts – e.g. genres
Methods of data collection and analysis
Meanings, interpretations, understandings
Tacit knowledge and understandings
Values
Methods, standards, and information for
evaluating other participants’ work and values
Institutional arrangements
Communities and Knowledge
Becoming a member of a community of
practice = identity
learning practices, values, orientation to the
world
We learn what to believe, whom to
believe, how to decide in epistemic
communities.
We tend to trust people from within our
own epistemic communities.
Similar values, orientation, practices,
standards
Ability to assess their credibility
DLs and Epistemic Cultures
DLs enable information to cross epistemic
communities.
More easily, more often than before.
Raw data, not just syntheses, analyses – e.g.
publications
Crossing communities often undermines our
practices of trust.
Who are these people?
How did they collect the data?
What do they know?
What are their goals, values, priorities?
DLs need to be designed to support
practices of assessing trustworthiness and
credibility.
Biodiversity Data
Biodiversity: studies diversity of life and
ecosystems that maintain it
Central question: change over space and
time
Uses large quantities of data that vary in:
Precision and accuracy
Methods of data collection, description, storage
Old data particularly valuable
Broad range of datasets: biological,
geographical, meteorological, geological…
Created and used by different professions,
disciplines, types of institutions…for different
purposes
Politically, economically, sensitive data
“Citizen Science”
Fine-grained data from observers in the
field
Observers with varying levels and types of
expertise
E.g., expert on an area, habitat, taxon…
Expert amateurs
Private-public cooperation
Government agencies, environmental action
groups, university herbaria, membership
organizations, concerned individuals…
CalFlora
http://www.calflora.org
Comprehensive web-accessible
database of plant distribution
information for California
Independent non-profit organization
Designed/managed by people from
botanical community, not librarians
or technologists
Free
In conjunction with UC Berkeley Digital
Library (http://elib.cs.berkeley.edu)
CalFlora Target Users
Researchers & professionals in land management
Ready access to data for
Addressing critical issues in plant biodiversity
Analyzing consequences of land use alternatives
and environmental change on distribution of native
and exotic species
The public: promoting interest in
biodiversity
Active engagement in biodiversity issues/work
Wildflowers as “charismatic”
CalFlora Priorities
Focus on people; put technology in the
back seat
Pay attention to how the world works for
the people who produce and use
information
Honor existing traditions of data exchange
Botanists at Work
Components of Interest Today
CalPhotos
CalFlora Occurrence Database
CalPhotos
In conjunction with the UC Berkeley Digital
Library Project http://elib.cs.berkeley.edu
> 28,000 images of California plants
Approx. half of all Calif. species are represented
Sources
Some institutions – e.g. Cal Academy of Sciences
Many from “native plant enthusiasts”
Currently accepting/soliciting contributions from users
Major reported uses
Plant identification
Illustrations
CalFlora Occurrence
Database
> 800,000 geo-referenced reports of
observations
Specimens in collections
Reports from literature
Reports from field
Checklists
Sources
19 institutions
About to begin accepting reports from registered
contributors via Internet
CalFlora Occurrence Database
Users can
“Click through the map to underlying data”
Download data for own analyses, tools
Uses
Land management decisions
Legally-mandated environmental reports (NEPA,
CEQA)
Identify plants (though not designed for this)
Common analyses
Which species are present in an area
Which are common, which are rare
Which species are restricted to a habitat affected by
proposed actions
Analyze various species in combination, by geography
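The common analyses listed above could be run on downloaded data roughly as follows. This is a minimal sketch, not CalFlora's actual data format or tooling: the `Occurrence` fields, the function names, and the sample records are all hypothetical and made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One downloaded occurrence report (hypothetical fields)."""
    species: str
    area: str      # e.g. a county or management unit
    habitat: str
    source: str    # contributing institution or observer


def species_in_area(records, area):
    """Which species are present in an area."""
    return sorted({r.species for r in records if r.area == area})


def report_counts(records):
    """Number of reports per species: a rough proxy for common vs. rare."""
    return Counter(r.species for r in records)


def restricted_to_habitat(records, habitat):
    """Species whose reports all fall within a single given habitat."""
    habitats = {}
    for r in records:
        habitats.setdefault(r.species, set()).add(r.habitat)
    return sorted(s for s, seen in habitats.items() if seen == {habitat})


# Made-up records with placeholder names, for illustration only.
SAMPLE = [
    Occurrence("Species A", "Area 1", "wetland", "Herbarium X"),
    Occurrence("Species A", "Area 2", "grassland", "Observer Y"),
    Occurrence("Species B", "Area 1", "wetland", "Herbarium X"),
    Occurrence("Species A", "Area 1", "wetland", "Observer Y"),
]

print(species_in_area(SAMPLE, "Area 1"))
print(report_counts(SAMPLE).most_common())
print(restricted_to_habitat(SAMPLE, "wetland"))
```

Note that a raw report count treats duplicate observations as independent evidence, which is exactly the risk raised earlier under "Unreliable Info"; keeping `source` on every record lets the analyst check for that.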
CalFlora Occurrence
Database: Significance
Most comprehensive source by far (for Calif)
Common as well as rare taxa
Biodiversity beginning to be interested in all
populations, not just rare -- requires vastly more data
Data downloadable, manipulable
Easy to use (for professionals, anyway)
Remote access via Internet
E.g. botanist in remote National Forest…
About to accept observations from “the public”
Source of valuable data re rare and especially common
species
Dilemmas and Conflicts
Useful place to see tensions, breakdowns,
conflicts across epistemic cultures
Not who’s right or wrong, but underlying
differences in values, priorities, practices,
understandings
CalFlora Dilemmas
Quality filtering: made centrally vs. pushed down
to user
Inclusiveness of observations vs. selectivity
Speed of additions vs. review, filtering
Labelling data for quality vs. providing info for users
Access
Benefits vs. dangers of wide access to information
Free vs. fee
Cost recovery
Discourage frivolous use
Who bears the costs?
Externalities
Dilemmas, Cont.
Institutional independence:
Autonomy, ability to be responsive to multiple
stake-holder communities vs. security and
credibility of institutional sponsorship
How (Some) Experts Assess
Occurrence Reports
The evidence:
Type of report (specimen, field observation,
list)
Type of search (casual, directed)
The source:
Personal knowledge of contributor’s expertise
Examination of other contributions, same
person
Annotations by trusted others
Ancillary conditions:
Likelihood of that species appearing at that
time, habitat, geographical location
How CalFlora Presents
Occurrence Data
Links to data source(s) – personal and
institutional
Compliance with institutional source’s
requirements
Fuzzed locations
Links to institutional source’s caveats, explanations
Publicly-contributed observations
Info about observer
Info about observation
Annotations by experts
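The slides do not say how locations are fuzzed. One common approach, shown here purely as an assumption, is to snap each coordinate to the centre of a coarse grid cell so that the exact site of a sensitive population cannot be recovered. The function name `fuzz_location` and the default cell size are hypothetical.

```python
import math


def fuzz_location(lat, lon, cell_deg=0.1):
    """Snap a coordinate to the centre of its grid cell.

    cell_deg is the cell size in decimal degrees (an illustrative
    default, not a value taken from CalFlora).
    """
    def snap(value):
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    # Round to hide floating-point noise from the multiplication.
    return round(snap(lat), 6), round(snap(lon), 6)


# Two nearby points fall in the same cell and become indistinguishable.
print(fuzz_location(37.8716, -122.2727))
print(fuzz_location(37.8701, -122.2601))
```

Grid snapping is deterministic, so repeated queries cannot be averaged to recover the true point, which can happen with random jitter.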
Contributor Registration
Biography, credentials (free text)
Expertise/interests (free text)
Affiliation
Contact info/web site
“I will submit only my own observations of wild
plants. I realize that this system is only for
first-hand reports about plants, native and
introduced, that are growing without deliberate
planting or cultivation.”
“I will…make sure I have the correct scientific
name…I will submit uncertain identifications only
if I believe them to be very important and time
sensitive, and will label such reports ‘uncertain.’”
Contributor Registration (cont)
Experience level (self-assessment; check one)
I am a professional biologist/botanist, or have
professional training in botany.
Although I do not have formal credentials, I am
recognized as a peer by professional botanists.
Although I do not consider myself to have
professional-level knowledge, I am quite experienced
in the use of keys and descriptions, and/or have
expertise with the plants for which I will be submitting
observations.
I do not have extensive experience or background in
botany, but I am confident that I can accurately
identify the plants for which I will be submitting
observations.
Occurrence Form
Species identification, habitat, location, date
Method of identification
“I recognize …from prior determinations and
experience”
“I compared this plant with herbarium specimens”
“I keyed this plant in a botanical reference”
“I compared … with published taxonomic
descriptions”
“An expert reviewed and confirmed this identification”
Certainty of identification
“I am confident of this identification, and submit this
as a positive observation.”
“I am not certain of this identification but believe it to
be a significant observation and submit it here as an
alert only.”
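As a rough illustration, the form fields above could be captured in a record that keeps the method and certainty of identification attached to each observation, so users can make their own quality judgments. This is a sketch under assumptions: the class and field names are hypothetical and do not reflect CalFlora's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class IdMethod(Enum):
    """Method of identification, paraphrasing the form's options."""
    RECOGNIZED = "recognized from prior determinations and experience"
    HERBARIUM = "compared with herbarium specimens"
    KEYED = "keyed in a botanical reference"
    DESCRIPTIONS = "compared with published taxonomic descriptions"
    EXPERT_CONFIRMED = "reviewed and confirmed by an expert"


class Certainty(Enum):
    """Certainty of identification, as offered on the form."""
    POSITIVE = "positive observation"
    ALERT_ONLY = "uncertain, submitted as an alert only"


@dataclass
class OccurrenceReport:
    species: str
    habitat: str
    location: str
    date: str          # kept as free text in this sketch
    method: IdMethod
    certainty: Certainty
    observer: str      # links back to the contributor's registration

    def is_positive(self) -> bool:
        """True if the observer submitted this as a positive observation."""
        return self.certainty is Certainty.POSITIVE

    def provenance(self) -> str:
        """One-line summary a user could read when weighing the report."""
        return f"{self.species}: {self.method.value}; {self.certainty.value}; by {self.observer}"


# Placeholder values for illustration only.
report = OccurrenceReport(
    species="Species A",
    habitat="wetland",
    location="Area 1",
    date="unknown",
    method=IdMethod.KEYED,
    certainty=Certainty.ALERT_ONLY,
    observer="Observer Y",
)
print(report.provenance())
```

Storing the observer's own statements rather than a computed quality score matches the approach of leaving quality decisions to users.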
Annotations
Herbarium practice: experts annotate
records with corrections, comments.
CalFlora: registered experts can annotate
photos and occurrence records.
Annotation by an expert raises the credibility
of a record.
Actually – how often?
CalFlora Data and Trust
Trusting data
Every observation trackable to source(s)
Detailed info & contact info for source, observer
Detailed info about observation
Observations categorized by type
Annotation
Trusting users
NOT registering or charging users
Respecting source’s limits, caveats on data
Leaving quality decisions to the users
Trusting CalFlora
Detailed list of contributing organizations, advisors
NOT affiliated with another organization
Concerns
CalFlora relies on record-by-record examination
Looking at methods of classifying records in
collections
CalFlora relies on voluntary contributions of data
Experts with lots of data and no time to contribute
Well-meaning volunteers with time but not expertise
Users need to be able to track back to source of each
record, each data point
Concern about “modalities,” uncertainties being lost
Archiving
Concern about dynamism of CalFlora
Stability of electronic media
Stability of the organization
Delegating decisions about quality of observations to
(inexpert) users
Implications for DLs, Other Info
Systems
The social nature of knowledge
We have to decide on whom we will depend
We learn from others whom and what we can depend
on
Information must be credible to be used
The importance of culture in constituting
knowledge
Practice, values, orientations…
Epistemic cultures differ
Not simply a matter of experts vs. public
Therefore:
DLs need to accommodate practices
Incl. practices of trust and credibility
Users need to know provenance of data
Users differ
and not just experts vs. nonexperts
DLs serve multiple, varied epistemic cultures
Same person, multiple cultures
Users need flexibility to accommodate the DL to
their needs, practices
Some users need decisions made for them
>> involvement of users in design
Implications for
DL Creation and Management
Different epistemic cultures participate in the
design and management of DLs, as well
Librarians
Technologists
Various, differing user groups
Differences in practices, understandings, values
>> differences in priorities and decisions
A continual process of negotiation and
translation