Discussion Class 6
Lucene
Course Administration
Midterm Examination on October 12
Wednesday, October 12, 7:30 to 9:00, Upson B17
Discussion Class on October 19
This class will be held in Phillips Hall 213
Discussion Classes
Format:
Question
Ask a member of the class to answer.
Provide opportunity for others to comment.
When answering:
Stand up.
Give your name. Make sure that the TA hears it.
Speak clearly so that the whole class can hear.
Suggestions:
Do not be shy about presenting partial answers.
Differing viewpoints are welcome.
Question 1
(a) How did you go about looking for information about Lucene? Did you search or browse? Which information sources did you find most useful? Is there any information that you were unable to find?
(b) Who created Lucene? What is the business model?
Question 2
(a) What are the underlying search mechanisms supported by Lucene?
(b) What algorithms does it use?
(c) What data structures does it use?
(i) To store and index fielded data
(ii) To maintain term weights
Question 3
(a) How do you load free text into Lucene?
(b) How do you load fielded text?
(c) What format options are there?
(d) How does it handle various character sets, stoplists, stemming, etc.?
Question 4
(a) Suppose that you are unhappy with the ranking of results provided by Lucene. What can you do about it?
(b) If you wanted to modify Lucene to support a novel search algorithm, how would you go about it?
Question 5
How do you incorporate Lucene queries and results into your own user interface?