Flying to the Top, One Tweet at a Time: Using Social Media to Rank Online Search Results
Robyn B. Reed, MA, MLIS
Co-authors: Carrie L. Iwema, PhD, MLS; Ansuman Chattopadhyay, PhD
Health Sciences Library System, University of Pittsburgh
Molecular Biology Information Service
• Workshops
• Consultations
• Website
• Software Licensing
Online Bioinformatics Resources Collection (OBRC)
http://www.hsls.pitt.edu/obrc/
Resources displayed by keyword ranking
Challenges:
• Many tools exist, and their number is increasing
• A user may retrieve several resources
• Common question: How do I know which one(s) to use?
Goal:
Provide up-to-date ratings of most highly regarded resources in bioinformatics
Objectives:
• Using social media, design a ranking system for OBRC resources
• Determine whether social media results reflect the opinions of bioinformatics experts
Why use social media?
• No official rankings of bioinformatics tools
• Opinions of several people
• Social media data has many applications
http://beta.socialguide.com/
Methodology
• Wrote 5 research questions (common bioinformatics queries)
• Each question listed 3 possible resources to accomplish that task
Methodology – Research Questions
• Two experts independently ranked the resources
• Resources were also ranked using social media data
Methodology – Social Media Ranking
Sources used for data collection:
• Google Blogs
• Google Discussions
Google Discussions includes:
• Forums
• Groups
• Comments
www.google.com
Methodology – Data Sources
Twitter considered and removed
• 50% of the resources had zero Tweets
• 20% captured non-specific Tweets
Facebook not included
• Concern over privacy settings
Methodology – Social Media Ranking
• Searched “all time”
Optimized for most accurate retrieval:
• Resource in quotes
• Increased specificity, decreased noise
• Fewer hits
Methodology – Search Filter
• Put all OBRC resources in a bioinformatics context
• Automate the searches
Example of a search for the UCSC Genome Browser:
[(“ucsc genome browser”) AND ( bioinformatics | genome | genetics | genomics | computer | algorithm | software | server | database | computer model | protein | proteomics | proteome | gene | DNA | RNA | sequence | alignment | interactions | structure | modeling | prediction | biochemistry | molecular biology | systems biology | computational biology)]
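The search filter above can be sketched as a small query builder. This is only an illustration of how the searches might be automated, not the tooling actually used for the project: the function name `build_query`, the `CONTEXT_TERMS` list name, and the choice to lowercase the resource name are assumptions made for this example. The context terms themselves are copied from the example search.

```python
# Context terms copied from the example search filter on the slide.
CONTEXT_TERMS = [
    "bioinformatics", "genome", "genetics", "genomics", "computer",
    "algorithm", "software", "server", "database", "computer model",
    "protein", "proteomics", "proteome", "gene", "DNA", "RNA", "sequence",
    "alignment", "interactions", "structure", "modeling", "prediction",
    "biochemistry", "molecular biology", "systems biology",
    "computational biology",
]


def build_query(resource_name, context_terms=CONTEXT_TERMS):
    """Quote the resource name and AND it with the OR'd context terms.

    Hypothetical helper: lowercasing mirrors the slide's example and is
    an assumption, not a documented requirement of the search engine.
    """
    quoted = '"' + resource_name.strip().lower() + '"'
    context = " | ".join(context_terms)
    return f"[({quoted}) AND ( {context} )]"


# Build the same kind of query for every resource in a category.
for name in ["Ensembl", "NCBI Map Viewer", "UCSC Genome Browser"]:
    print(build_query(name))
```

Keeping the context terms in one list means every OBRC resource is searched with the same filter, which is what makes the hit counts comparable across resources.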
Results
Category                      Bioinformatics Tool           Blogs + Discussion  Social Media  Expert 1  Expert 2
                                                            Raw Numbers         Rank          Rank      Rank
3-D protein prediction        CPHmodels                     49                  2             2 or 3    2
                              ESypred3D                     17                  3             2 or 3    3
                              SWISS-MODEL                   228                 1             1         1
PCR primer design             IDT SciTools                  4                   2             2         2
                              Primer3                       728                 1             1         1
                              Primer Design Assistant       0                   3             3         3
microRNA target design        DIANA-microT                  12                  1             1         2
                              miRGator                      9                   2             2         3
                              siRNA target finder (Ambion)  3                   3             3         1
multiple sequence alignment   ClustalW                      1494                1             1         3
                              ECR Browser                   8                   3             3         1
                              Tcoffee                       63                  2             2         2
genome browsers               Ensembl                       3070                1             3         2
                              NCBI Map Viewer               56                  3             2         3
                              UCSC Genome Browser           928                 2             1         1
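As a rough sketch of how raw hit counts become a social media rank, and of one possible way to compare that rank with an expert's, the example below uses the PCR primer design counts from the results (4, 728, 0) and the expert ranks as read from the same table. The function names are hypothetical, and Spearman's rank correlation is an assumption introduced here for illustration; the slides do not state which agreement measure, if any, was used. With only 3 tools per category the coefficient is very coarse.

```python
def rank_by_count(counts):
    """Map each resource to its rank, where 1 is the most hits.

    Ties are not handled; resources with equal counts keep input order.
    """
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((ranks_a[name] - ranks_b[name]) ** 2 for name in ranks_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Blogs + Discussion raw numbers for the PCR primer design category.
pcr_hits = {"IDT SciTools": 4, "Primer3": 728, "Primer Design Assistant": 0}
# Expert ranks for the same category, as read from the results table.
expert_ranks = {"IDT SciTools": 2, "Primer3": 1, "Primer Design Assistant": 3}

social_ranks = rank_by_count(pcr_hits)
print(social_ranks)
print(spearman_rho(social_ranks, expert_ranks))
```

A value near 1 means the social media rank and the expert rank order the tools the same way; a value near -1 means they order them in reverse.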
Conclusions:
• This system can be used to determine highly regarded tools
• Explain to patrons that rankings are subjective; suggest trying the top 3-5 resources
• Provides patrons with a starting point when using the OBRC
Limitations
• Quotation marks can be limiting if a resource name is longer than one word
• Covers a very small part of total social media
• “Negative” discussion about a resource also counts toward its rank
Future Directions
• Test more than 3 bioinformatics tools per category
• Increase the number of expert ratings
• Test applicability of the system in areas other than bioinformatics
Special thanks to:
Project collaborators and experts: Ansuman Chattopadhyay, PhD; Carrie Iwema, PhD, MLS
Research and academic advisors: Nancy Tannery, MLS; Rebecca Crowley, MD, MS
Funding from the Pittsburgh Biomedical Informatics Training Program, NLM Grant 3 T15 LM007059-23S1