The EBI search engine: EB-eye
Franck Valentin, External Services group
EMBRACE Workshop, CBS, BioCentrum-DTU, February 6-8, 2008
EBI is an Outstation of the European Molecular Biology Laboratory.
07.11.2015 Web Services Course, CBS, DK

Summary
• The data at the EBI
• What is the EB-eye?
• A glance at the web interface
• Web services for the EB-eye

The data at the EBI
[Figure: EBI data resources held in heterogeneous formats: flat files (ID ... AC ... DT ...), XML files (<XML> . . . </XML>) and tagged records (ID : .. PARENT ID : .. RANK : .. ...). Resource labels visible on the slide: Ligand, Array Express, Interpro.]

The data at the EBI
• Searching the data at the EBI
  • Diversity and heterogeneity of the data (format, size, content…)
  • Most of the data providers have their own search mechanism
  • Heterogeneity of the search results (display, content, granularity…)
  • Navigation between the different resources (references) not consistent

What is the EB-eye?
• Global search mechanism
  • Searches most of the EBI resources in one go
  • Not specific to any resource
• Unified searches of the EBI resources
  • Free-text search (unified semantic)
  • Basic results display (Google-like)
  • Simple cross reference navigation
• Available on all the EBI web pages

A glance at the web interface

EB-eye results summary page
• Organized into categories called “domains”
• Number of results per domain
• Refine your search
• Expand/Collapse for more details

EB-eye domain result page
• Results for all the resources in a domain
  • A domain can contain several resources
  • First 3 entries displayed for each resource
  • View more entries for a particular resource
• Hierarchy of domains
  • Forward search (smaller set of resources)
  • Backward search (wider set of resources)
• Refine your search
• Navigate the results pages

EB-eye domain result page (one resource)
• Basic information: ID, name, description…
• Link to the main resource web site
• Additional links
• EB-eye internal references

EB-eye cross-references navigation
• Navigate inside the EB-eye
• References context
• Navigation…
  • Using resources explicit references
  • Using resources implicit references

EB-eye Advanced Search
• Accessible from all the pages
• Simple search criteria
• Domain specific search
  • Domain selection
  • Fields selection
  • References

Web services for the EB-eye
• Simple experimental API for basic operations
  • Basic metadata information
  • Basic queries (Full-text and entries)
  • Limited cross-references navigation
  • Depending on the usage, we may implement a more complex API and more functionalities

Web services – Listing the domains
List
available domains (list only the leaves)
String[] listDomains()
> listDomains()
…
astd
…
ensembl emblcds embldeleted emblnew_ann_con emblnew_con emblnew_standard emblnew_wgs
emblrelease_ann_con emblrelease_con emblrelease_standard emblrelease_wgs ensembl
…

Web services – Number of results
Get number of results for a simple query
int getNumberOfResults(String domain, String query)
> getNumberOfResults('medline', 'immunolog* nutrition')
6954

Web services – Get results ids
List result IDs for a simple query
String[] getResultsIds(String domain, String query)
String[] getResultsIds(String domain, String query, int start, int size)
> getResultsIds('uniprot', 'polymerase', 0, 5)
A2VB99_9VIRU
Q86777_9CALI
Q779J8_9VIRU
Q8I944_9STIC
Q8I945_9STIC

Web services – Get referenced domains
Get referenced domains in a domain or an entry
String[] getDomainsReferencedInEntry(String domain, String entryId)
String[] getDomainsReferencedInDomain(String domain)
> getDomainsReferencedInEntry('ensembl', 'cg2102')
embldeleted emblnew_ann_con emblnew_con emblnew_standard emblnew_wgs
emblrelease_ann_con emblrelease_con emblrelease_standard emblrelease_wgs
go taxonomy uniprot

Web services – Get referenced entries
Get referenced entries for a domain in a particular entry
String[] getReferencedEntries(String domain, String entryId, String referencedDomain)
> getReferencedEntries('ensembl', 'cg2102', 'go')
GO:0005634 GO:0046872 GO:0008270 GO:0016319 GO:0003676 GO:0003677
GO:0045892 GO:0006350 GO:0006355 GO:0007275 GO:0007399 GO:0007402
GO:0007417 GO:0007419 GO:0003700 GO:0009791 GO:0030154

Web services – External cross-references
List non EB-eye domains referenced in a domain
String[] listAdditionalReferenceFields(String domain)
> listAdditionalReferenceFields('msdpdb')
CATH
PFAM
SCOP
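The getNumberOfResults and getResultsIds signatures above suggest a simple client-side paging loop. The following is a minimal Python sketch of that idea, not the actual wrapper: FakeEBeye is a hypothetical in-memory stand-in that ignores the query text, iter_result_ids and page_size are illustrative names, and it assumes that start is a zero-based offset and size a page length, as the getResultsIds('uniprot', 'polymerase', 0, 5) example seems to indicate.

```python
from typing import Dict, Iterator, List


class FakeEBeye:
    """Hypothetical in-memory stand-in for the service (not the real API)."""

    def __init__(self, ids_by_domain: Dict[str, List[str]]) -> None:
        self._ids = ids_by_domain

    def getNumberOfResults(self, domain: str, query: str) -> int:
        # The stub ignores the query text and counts every stored ID.
        return len(self._ids.get(domain, []))

    def getResultsIds(self, domain: str, query: str, start: int, size: int) -> List[str]:
        # Assumption: `start` is a zero-based offset, `size` a page length.
        return self._ids.get(domain, [])[start:start + size]


def iter_result_ids(service, domain: str, query: str, page_size: int = 2) -> Iterator[str]:
    """Yield every result ID, fetching at most `page_size` IDs per call."""
    total = service.getNumberOfResults(domain, query)
    for start in range(0, total, page_size):
        yield from service.getResultsIds(domain, query, start, page_size)


# IDs copied from the getResultsIds slide; the stub returns them for any query.
service = FakeEBeye({"uniprot": [
    "A2VB99_9VIRU", "Q86777_9CALI", "Q779J8_9VIRU", "Q8I944_9STIC", "Q8I945_9STIC",
]})
all_ids = list(iter_result_ids(service, "uniprot", "polymerase", page_size=2))
print(all_ids)
```

Asking for the total first keeps the loop bounded even if a page comes back short; any object exposing the same two methods could be passed in place of the stub.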
Web services – The fields
[Figure: three annotated sample records: a flat file (EMBL entry AF030562, Fusarium venenatum), an XML file (a Medline citation) and an EB-eye dump file (XML, database IntAct.Experiment). Field annotations visible on the slide: id (value stored), acc (value stored), creation_date / last_modification_date (values non stored), description (value stored), organism_species (value non stored), organism_classification (value non stored), references (non stored), issn (value non stored), volume (value stored), name (value non stored).]

Web services – The fields
List available (stored) fields in a domain
String[] listFields(String domain)
> listFields('uniprot')
acc_number
description
id
name

Web services – Get results with fields
List result fields values for a simple query
String[][] getResults(String domain, String query, String[] fields, int start, int size)
> getResults('uniprot', 'polymerase', ['acc', 'id', 'description'], 0, 5)
acc             description                     id
------------------------------------------------------------------
A2VB99          Polymerase.                     A2VB99_9VIRU
Q86777          RNA polymerase (Fragment).      Q86777_9CALI
Q779J8 Q0E5A0   DNA polymerase (EC 2.7.7.7).    Q779J8_9VIRU
Q8I944          DNA polymerase (EC 2.7.7.7).    Q8I944_9STIC

Web services – Get result fields values for entries
Get result fields values for one or several entries
String[] getEntry(String domain, String entryId, String[] fields)
String[][] getEntries(String domain, String[] entryIds, String[] fields)
> getEntry('medline', '7605758', ['description', 'publication_date', 'authors'])
description : BACKGROUND AND OBJECTIVES: Intraspinally administered alpha 2-adrenergic agonists produce analgesia in part by causing spinal acetylcholine and nitric oxide (NO) release. Clonidine-induced analgesia is enhanced by subarachnoid neostigmine and inhibited by N-methyl-L-arginine (NMLA), a blocker of NO synthesis. The authors tested whether dexmedetomidine, an alpha [...]
publication_date : 1995 Mar-Apr
authors : Bouaziz H. Hewitt C. Eisenach J.C.

Web services – Get the urls
Returns the urls configured for a field of an entry
String[] getEntryFieldUrls(String domain, String entry, String[] fields)
String[][] getEntriesFieldUrls(String domain, String[] entries, String[] fields)
> getEntryFieldUrls('uniprot', 'Q9QUZ9_9MURI', ['id'])
http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[UNIPROT:Q9QUZ9_9MURI]+-newId

Web services – Referenced entries from a domain
List of referenced entries from a domain referenced in a set of entries
String[][] getReferencedEntriesFlatSet(String domain, String[] entries, String referencedDomain, String[] fields)
dict(String[][]) getReferencedEntriesSet(String domain, String[] entries, String referencedDomain, String[] fields)
> getReferencedEntriesSet('ensembl', ['AAEL005345', 'CG2102'], 'go', ['id', 'name'])
'AAEL005345' -> [GO:0016319, 'mushroom body development'],
                [GO:0045892, 'negative regulation of transcription,DNA-dependent'],
                [GO:0007417, 'central nervous system development'],
                [GO:0009791, 'post-embryonic development']
'CG2102'     -> [GO:0005634, 'nucleus'],
                [GO:0046872, 'metal ion binding'],
                [GO:0008270, 'zinc ion binding'],
                [GO:0016319, 'mushroom body development'],
                [GO:0003676, 'nucleic acid binding'],
                [GO:0003677, 'DNA binding,
                ...

Web services – Links
• WSDL: http://www.ebi.ac.uk/ebisearch/service.ebi?wsdl
• Documentation: http://www.ebi.ac.uk/Tools/webservices/services/eb-eye
• Feedback! http://www.ebi.ac.uk/support/

Web services – Let's play!
• 2 wrappers to hide the SOAP hassle
  • EBeyeWSWrapper.pm
  • EBeyeWSWrapper.py
• Test files to play with
  • testEBeyeWSWrapper.pl
  • testEBeyeWSWrapper.py
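Before trying the wrappers against the live service, the String[][] shape returned by getResults can be explored offline. The sketch below is one possible way to turn such rows into per-entry records in Python. It is not taken from EBeyeWSWrapper.py: FakeEBeye and results_as_records are hypothetical names, the sample rows are copied from the getResults slide, and it assumes that each returned row lists its values in the same order as the requested fields, which the slides do not state explicitly.

```python
from typing import Dict, List, Sequence


class FakeEBeye:
    """Hypothetical offline stand-in; not EBeyeWSWrapper.py or the real API."""

    # Sample values copied from the getResults slide, keyed by field name.
    _ROWS = [
        {"acc": "A2VB99", "id": "A2VB99_9VIRU", "description": "Polymerase."},
        {"acc": "Q86777", "id": "Q86777_9CALI", "description": "RNA polymerase (Fragment)."},
        {"acc": "Q8I944", "id": "Q8I944_9STIC", "description": "DNA polymerase (EC 2.7.7.7)."},
    ]

    def getResults(self, domain: str, query: str, fields: List[str],
                   start: int, size: int) -> List[List[str]]:
        # Assumption: every row lists its values in the order of `fields`.
        # The stub ignores `domain` and `query`.
        page = self._ROWS[start:start + size]
        return [[row[field] for field in fields] for row in page]


def results_as_records(service, domain: str, query: str, fields: Sequence[str],
                       start: int = 0, size: int = 5) -> List[Dict[str, str]]:
    """Pair each value in a result row with the field name that was requested."""
    rows = service.getResults(domain, query, list(fields), start, size)
    return [dict(zip(fields, row)) for row in rows]


records = results_as_records(FakeEBeye(), "uniprot", "polymerase",
                             ["acc", "id", "description"])
for record in records:
    print(record["acc"], record["id"], record["description"], sep="\t")
```

Keying values by field name avoids relying on column positions in the calling code, which matters if the set of requested fields changes later.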