An introduction to Solr

Implementing search with free software
By Mick England
What is Solr?
• Solr is an open source enterprise search server based on the Lucene Java search library.
• Solr runs in a Java servlet container such as Tomcat or Jetty.
• Solr is free software and a project of the Apache Software Foundation.
• Solr is a sub-project of Lucene and can be found at http://lucene.apache.org/solr/
Key Features
• Advanced full-text search
• Optimized for high-volume web traffic
• Standards-based open interfaces: XML and HTTP
• Comprehensive HTML administration interface
• Server statistics exposed over JMX for monitoring
• Scalability through efficient replication
• Flexibility with XML configuration and plugins
• Push rather than crawl indexing: documents are sent to Solr instead of being fetched by a crawler
Solr Clients
• Solr can be integrated with, among others:
• Ruby
• PHP
• Java
• Python
• JSON
• Forrest/Cocoon
• C# (Deveel Solr Client or SolrNet)
• ColdFusion
• Drupal (apachesolr project for Drupal)
Indexing
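• Push vs crawl: documents are pushed to Solr, which does not crawl for content itself
• schema.xml defines the fields of the index
• Add documents
• HTML interface
• Update
• Delete
• Commit
• DataImportHandler, for indexing content held in databases so that it can be searched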
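The add, delete and commit steps listed above are sent to Solr as small XML messages over HTTP. The following is a minimal Python sketch of how such messages might be built, not a definitive client. The /solr/update path, the field names (id, category) and the helper names are assumptions for illustration; the fields you send must exist in your own schema.xml.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update endpoint, using the same host and port as the search examples.
UPDATE_URL = "http://localhost:8983/solr/update"


def add_message(doc):
    """Build an <add> message for one document (a field-name -> value dict)."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


def delete_message(doc_id):
    """Build a <delete> message that removes one document by its unique id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


def commit_message():
    """Build the message that makes pending adds and deletes searchable."""
    return "<commit/>"


def post(xml_body, url=UPDATE_URL):
    """Send one update message to Solr. Requires a running server."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a server running, a typical sequence would be post(add_message({...})) followed by post(commit_message()); changes are not visible to searches until the commit has been sent.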
Searching
• Full-text search
http://localhost:8983/solr/select?q=Iraq
• Search only within a field
http://localhost:8983/solr/select?q=category:news
• Control which fields are displayed in the result
http://localhost:8983/solr/select?q=video&fl=id,category
• Restrict a field to a range of values
http://localhost:8983/solr/select?q=price:[0%20TO%20400]&fl=id,name,price
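Typing these URLs by hand makes it easy to get the encoding wrong, especially the spaces in a range query. As a hedged sketch, the Python helper below builds the same kind of URL with the standard library; select_url is a hypothetical name, and the q and fl parameters are the ones shown in the examples above.

```python
from urllib.parse import urlencode

# Select endpoint as used in the examples on this slide.
SELECT_URL = "http://localhost:8983/solr/select"


def select_url(q, **params):
    """Return a search URL for query q plus any extra parameters such as fl."""
    query = {"q": q}
    query.update(params)
    # urlencode escapes spaces, brackets, colons and commas for us.
    return SELECT_URL + "?" + urlencode(query)


# The four searches from the slide, expressed through the helper.
full_text = select_url("Iraq")
in_field = select_url("category:news")
chosen_fields = select_url("video", fl="id,category")
price_range = select_url("price:[0 TO 400]", fl="id,name,price")
```

Python writes the space as + and the brackets as percent escapes, so price_range looks different from the hand-written URL but decodes to the same query.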
More Searching
• Faceting information
http://localhost:8983/solr/select?q=news&fl=id,description&facet=true&facet.field=category
• More like this (MLT)
http://localhost:8983/solr/select?q=Iraq&mlt=true&mlt.fl=headline&mlt.mindf=1&mlt.mintf=1&fl=id,score&rows=100
  ◦ More information on how this works and the options available can be found at http://wiki.apache.org/solr/MoreLikeThis
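A faceted request returns, alongside the matching documents, a count of documents per value of the facet field. The Python sketch below shows one way to read those counts out of the default XML response. The SAMPLE text is invented for illustration (it is not real output, and the counts are made up), and it assumes the usual facet_counts / facet_fields nesting of the XML response; facet_counts is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented, trimmed-down response for a request with facet.field=category.
SAMPLE = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="category">
        <int name="news">3</int>
        <int name="sport">1</int>
      </lst>
    </lst>
  </lst>
</response>"""


def facet_counts(xml_text, field):
    """Return {facet value: document count} for one facet field."""
    root = ET.fromstring(xml_text)
    path = (
        "./lst[@name='facet_counts']"
        "/lst[@name='facet_fields']"
        "/lst[@name='%s']/int" % field
    )
    # Each <int> carries the facet value in its name attribute
    # and the number of matching documents as its text.
    return {el.get("name"): int(el.text) for el in root.findall(path)}


counts = facet_counts(SAMPLE, "category")
```

A field that was not requested as a facet simply yields an empty dictionary.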
QueryResponseWriter
• A QueryResponseWriter is a Solr plugin that defines the response format for any request.
• All of the requests we have made so far are formatted with the XMLResponseWriter.
• Other formats can be applied by appending wt=format to the search string like this:
http://localhost:8983/solr/select?q=date:[1998%20TO%201999]&fl=id,name,date,headline&rows=200&wt=xslt&tr=example.xsl
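For client code, a format that is easy to parse is often more convenient than XML or XSLT output. As an illustration only, the Python sketch below requests wt=json and reads document ids from the result. It assumes the JSON response writer is enabled on your server; SAMPLE_JSON is an invented response, and select_url and doc_ids are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

# Select endpoint as used in the example above.
SELECT_URL = "http://localhost:8983/solr/select"


def select_url(q, wt, **params):
    """Return a search URL that asks for the response format named by wt."""
    query = {"q": q, "wt": wt}
    query.update(params)
    return SELECT_URL + "?" + urlencode(query)


def doc_ids(json_text):
    """Return the id of every document in a JSON-formatted response."""
    data = json.loads(json_text)
    return [doc["id"] for doc in data["response"]["docs"]]


# Invented response body, shaped like the output of a wt=json request.
SAMPLE_JSON = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 2, "start": 0,'
    ' "docs": [{"id": "a1"}, {"id": "b2"}]}}'
)

url = select_url("video", "json", fl="id")
ids = doc_ids(SAMPLE_JSON)
```

The same select_url call with a different wt value selects another writer, so the choice of format stays a single parameter.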
Acknowledgements
• Search smarter with Apache Solr, Part 1: Essential features and the Solr schema
  http://www.ibm.com/developerworks/java/library/j-solr1/
• Solr Tutorial from Lucid Imagination
  http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and-Videos/Solr-Tutorial
• Solr Wiki
  http://wiki.apache.org/solr/