USING THE PAST TO SCORE THE PRESENT: EXTENDING TERM WEIGHTING MODELS WITH REVISION HISTORY ANALYSIS
Ablimit Aji, Yu Wang, Eugene Agichtein, Evgeniy Gabrilovich
Oct. 28, 2010

Slide 2: Revisions of “Topology” on Wikipedia
[Figures: 1st revision, 250th revision, current revision]

Slide 3: Observable Document Generation Process
Revision #i-1 (95th revision):
In mathematics, '''topology''' is a branch concerned with the study of topological spaces. Roughly speaking, topology is the study of geometric objects without considering their dimensions.
Revision #i (96th revision):
In mathematics, '''topology''' is a branch concerned with the study of topological spaces. Topology is also concerned with the study of the so called topological properties of figures, that is to say properties that does not change under a bicontinuous one-to-one transformation (call homeomorphisms

Slide 4: How Revision History Analysis Could Help Retrieval
[Figure: Revision History Analysis]

Slide 5: Selected Prior Work
• J. Elsas and S. Dumais. Leveraging temporal dynamics of document content in relevance ranking. In Proc. of WSDM, 2010.
• M. Efron. Linear time series models for term weighting in information retrieval. JASIST, 2010.
• J. He, H. Yan, and T. Suel. Compact full-text indexing of versioned document collections. In CIKM, New York, NY, USA, 2009.

Slide 6: Revision History Analysis (RHA)
RHA redefines term frequency (TF):
- TF is a key indicator of document relevance
- TF can be naturally integrated into ranking models

BM25:
$$S(Q, D) = \sum_{t \in Q} IDF(t) \cdot \frac{TF(t, D) \cdot (k_1 + 1)}{TF(t, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{avgdl}\right)}$$

Language Model:
$$S(Q, D) = D(Q \,\|\, D) = \sum_{t \in V} P(t \mid Q) \cdot \log \frac{P(t \mid Q)}{P(t \mid D)}$$

Slide 7: Model 1: Steady growth
First revision:
Topology, in mathematics, is both a structure used to capture the notions of continuity, connectedness and convergence, and the name of the branch of mathematics which studies these.
Current version:
Topology (from the Greek τόπος, “place”, and λόγος, “study”) is a major area of mathematics concerned with spatial properties that are preserved under continuous deformations of objects, for example … basic examples include compactness and connectedness

Slide 8: Model 1 (continued)
[Figure]

Slide 9: RHA Global Model: Definition
Define the term frequency over the whole document generation process:
– a document grows steadily over time
– a term is relatively important if it appears in the early revisions.

$$TF_{global}(t, d) = \sum_{j=1}^{n} \frac{c(t, v_j)}{j^{\alpha}}$$

where $c(t, v_j)$ is the frequency of term $t$ in revision $v_j$ and $j^{\alpha}$ is the decay factor.

Slide 10: But… Some pages are different: “Avatar (2009 film)”
[Figures: 1st revision, 500th revision, current revision]

Slide 11: Model 2: Bursty Growth
Burst of document length and change of term frequency:

Time        TF “Pandora”   TF “James Cameron”   Document length
Nov. 2009   9              23                   2576
Dec. 2009   25             50                   6306

Burst of edit activity and associated events:

Month (2009)    Jul.   Aug.   Sep.   Oct.   Nov.   Dec.
Edit activity   89     224    67     154    232    1892

Associated events: first photo & trailer released; movie released.
The global model might be insufficient.

Slide 12: RHA Burst Model: Definition
• A burst resets the decay clock for a term.
• The weight will decrease after a burst.

$$TF_{burst}(t, d) = \sum_{j=1}^{m} \sum_{k=b_j}^{n} \frac{c(t, v_k)}{(k - b_j + 1)^{\beta}}$$

where $c(t, v_k)$ is the frequency of term $t$ in revision $v_k$, $b_j$ is the revision at which the $j$-th burst starts, and $(k - b_j + 1)^{\beta}$ is the decay factor for the $j$-th burst.

Slide 13: Burst Detection (1): Content-based
Relative content change ($\Delta c$) signals a potential burst:

$$\mathcal{B}_c(v_j) = \begin{cases} 1, & \dfrac{|v_j| - |v_{j-1}|}{|v_{j-1}|} > \alpha \\ 0, & \text{otherwise} \end{cases}$$

[Figure: content-based bursts for “Avatar”]

Slide 14: Burst Detection (2): Activity-based
Intensive edit activity within a time window $\Delta t$ signals potential bursts:

$$\mathcal{B}_a(ep_j) = \begin{cases} 1, & \text{revision counts in } \Delta t > \mu + \sigma \\ 0, & \text{otherwise} \end{cases}$$

where $\mu$ is the average revision count and $\sigma$ is the deviation.
[Figure: activity-based bursts for “Avatar”]

Slide 15: Burst Detection (3): Combined Model
[Figure]

Slide 16: Putting it All Together: RHA Term Frequency
Combining the global model and the burst model, the RHA term frequency is:

$$TF_{rha}(t, D) = \lambda_1 \cdot TF_g(t, D) + \lambda_2 \cdot TF_b(t, D) + \lambda_3 \cdot TF(t, D), \qquad \lambda_1 + \lambda_2 + \lambda_3 = 1$$

$\lambda_1$, $\lambda_2$ and $\lambda_3$ are the weights of the RHA global model, the burst model, and the original term frequency (probability).
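
The definitions above (global model, burst model, the two burst detectors, and the combined RHA term frequency) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, revisions are numbered from 1, the original TF(t, D) is taken to be the term count in the last revision, and the "deviation" in the activity-based detector is assumed to be the population standard deviation, which the slides do not specify.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def tf_global(counts: Sequence[float], alpha: float) -> float:
    """Global model: sum over revisions j = 1..n of c(t, v_j) / j**alpha."""
    return sum(c / j ** alpha for j, c in enumerate(counts, start=1))


def content_bursts(lengths: Sequence[int], threshold: float) -> List[int]:
    """Content-based detector: 1-based revision numbers j whose relative
    length growth (|v_j| - |v_{j-1}|) / |v_{j-1}| exceeds the threshold."""
    bursts = []
    for j in range(2, len(lengths) + 1):
        prev, cur = lengths[j - 2], lengths[j - 1]
        if prev > 0 and (cur - prev) / prev > threshold:
            bursts.append(j)
    return bursts


def activity_bursts(edits_per_period: Sequence[int]) -> List[int]:
    """Activity-based detector: 1-based period numbers whose revision count
    exceeds mu + sigma (assumption: sigma is the population std. deviation)."""
    cutoff = mean(edits_per_period) + pstdev(edits_per_period)
    return [i for i, n in enumerate(edits_per_period, start=1) if n > cutoff]


def tf_burst(counts: Sequence[float], burst_starts: Sequence[int], beta: float) -> float:
    """Burst model: each burst start b_j restarts the decay clock, so revision
    k contributes c(t, v_k) / (k - b_j + 1)**beta for every burst j with b_j <= k."""
    n = len(counts)
    return sum(
        counts[k - 1] / (k - b + 1) ** beta
        for b in burst_starts
        for k in range(b, n + 1)
    )


def tf_rha(counts: Sequence[float], burst_starts: Sequence[int], alpha: float,
           beta: float, lambdas: Tuple[float, float, float]) -> float:
    """Combined RHA term frequency: weighted sum of the global model, the
    burst model, and the current-revision count (assumed to be TF(t, D))."""
    l1, l2, l3 = lambdas
    if abs(l1 + l2 + l3 - 1.0) > 1e-9:
        raise ValueError("lambda weights must sum to 1")
    return (l1 * tf_global(counts, alpha)
            + l2 * tf_burst(counts, burst_starts, beta)
            + l3 * counts[-1])


if __name__ == "__main__":
    # Toy data (made up for illustration): term counts and document lengths
    # for four revisions, with a jump in length at the third revision.
    counts = [1, 1, 4, 4]
    lengths = [100, 110, 300, 310]
    bursts = content_bursts(lengths, threshold=0.5)
    print(bursts, tf_rha(counts, bursts, 1.0, 1.0, (0.3, 0.4, 0.3)))
    # Monthly edit activity for "Avatar" (Jul. to Dec. 2009) from the slide.
    print(activity_bursts([89, 224, 67, 154, 232, 1892]))
```

The burst starts returned by either detector can be passed directly to `tf_burst`; how the two detectors are merged in the combined burst model is not recoverable from the transcript, so the sketch leaves them separate.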

Slide 17: Integrating RHA into Retrieval Models
BM25 + RHA:
$$S(Q, D) = \sum_{t \in Q} IDF(t) \cdot \frac{TF_{rha}(t, D) \cdot (k_1 + 1)}{TF_{rha}(t, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{avgdl}\right)}$$

Statistical Language Models + RHA:
$$S(Q, D) = D(Q \,\|\, D) = \sum_{t \in V} P(t \mid Q) \cdot \log \frac{P(t \mid Q)}{P_{rha}(t \mid D)}$$

RHA term probability:
$$P_{rha}(t \mid D) = \lambda_1 \cdot P_g(t \mid D) + \lambda_2 \cdot P_b(t \mid D) + \lambda_3 \cdot P(t \mid D)$$

Slide 18: Experimental Setup

Slide 19: Datasets
INEX: well-established forum for structured retrieval tasks (based on a Wikipedia collection)
TREC: performance comparison on a different set of queries, to test general applicability
Corpus construction from the Wiki dump:
• INEX (65 topics): top 1000 retrieved articles, 1000 revisions for each article → corpus for INEX
• TREC (68 topics): top 1000 retrieved articles, 1000 revisions for each article → corpus for TREC

Slide 20: Results

Slide 21: INEX Results

Model      bpref            MAP              R-precision
BM25       0.354            0.354            0.314
BM25+RHA   0.375 (+5.93%)   0.360 (+1.69%)   0.337 (+7.32%)
LM         0.357            0.370            0.348
LM+RHA     0.372 (+4.20%)   0.378 (+2.16%)   0.359 (+3.16%)

Parameters tuned on the INEX query set:
BM25: $\lambda_1 = 0.3$, $\lambda_2 = 0.4$, $\lambda_3 = 0.3$
LM: $\lambda_1 = 0.3$, $\lambda_2 = 0.2$, $\lambda_3 = 0.5$

Slide 22: TREC Results

Model      bpref              MAP                NDCG
BM25       0.524              0.548              0.634
BM25+RHA   0.547** (+4.39%)   0.568** (+3.65%)   0.656** (+3.47%)
LM         0.527              0.556              0.645
LM+RHA     0.532 (+0.95%)     0.567 (+1.98%)     0.653 (+1.24%)

Parameters tuned on the INEX query set. ** indicates a statistically significant difference at the 0.01 significance level (two-tailed paired t-test).
BM25: $\lambda_1 = 0.3$, $\lambda_2 = 0.4$, $\lambda_3 = 0.3$
LM: $\lambda_1 = 0.3$, $\lambda_2 = 0.2$, $\lambda_3 = 0.5$
Lab members manually labeled the top 20 results for each topic.

Slide 23: Performance Analysis
[Figure: performance improvements on bpref for BM25+RHA over the baseline (BM25); panels: INEX, TREC]
INEX: significant improvement on 40% of queries
TREC: significant improvement on 37% of queries
Ex: “circus acts skills”, “olive oil health benefit” (+20% BM25, +11% LM improvement)

Slide 24: Summary
o RHA captures an importance signal from the document authoring process.
o Introduced the RHA term weighting approach.
o Natural integration with state-of-the-art retrieval models.
o Consistent improvement over baseline retrieval models.

Slide 25: Thank you!
Using the Past to Score the Present: Extending Term Weighting Models with Revision History Analysis
Ablimit Aji, Yu Wang, Eugene Agichtein, Evgeniy Gabrilovich
Research partially supported by: [sponsor logos]

Slide 26: Query Sets and Evaluation Metrics
• Queries and labels:
– INEX: provided
– TREC: subset of ad-hoc track
• Metrics:
– bpref (robust to missing judgments)
– MAP: mean average precision
– R-prec: precision at position R

Slide 27: RHA in Statistical Language Models
o $P_{rha}(w \mid D) = \lambda_1 \cdot P_g(w \mid D) + \lambda_2 \cdot P_b(w \mid D) + \lambda_3 \cdot P(w \mid D)$
o Global model:
$$P_g(w \mid D) = \frac{\sum_{j=1}^{n} \frac{c(w, v_j)}{j^{\alpha}}}{\sum_{w' \in D} \sum_{j=1}^{n} \frac{c(w', v_j)}{j^{\alpha}}}$$
o Burst model:
$$P_b(w \mid D) = \frac{\sum_{j=1}^{m} \sum_{k=b_j}^{n} \frac{c(w, v_k)}{(k - b_j + 1)^{\beta}}}{\sum_{w' \in D} \sum_{j=1}^{m} \sum_{k=b_j}^{n} \frac{c(w', v_k)}{(k - b_j + 1)^{\beta}}}$$
o $\lambda_1 + \lambda_2 + \lambda_3 = 1$

Slide 28: Cross validation on INEX
5-fold cross validation on the INEX 2008 query set:

Model      bpref            MAP              R-precision
BM25       0.307            0.281            0.324
BM25+RHA   0.312 (+1.63%)   0.291 (+3.56%)   0.320 (-1.23%)
LM         0.311            0.284            0.348
LM+RHA     0.338 (+8.68%)   0.298 (+4.93%)   0.359 (+0.61%)

5-fold cross validation on the INEX 2009 query set:

Model      bpref            MAP              R-precision
BM25       0.354            0.354            0.314
BM25+RHA   0.363 (+2.54%)   0.348 (-1.70%)   0.333 (+6.05%)
LM         0.357            0.370            0.348
LM+RHA     0.366 (+2.52%)   0.375 (+1.35%)   0.352 (+1.15%)
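
As a companion to the definitions in “RHA in Statistical Language Models”, the following Python sketch shows one way the mixed term probability could be computed for a small vocabulary. It is an illustration under assumptions rather than the authors' code: the function names are hypothetical, revisions are numbered from 1, P(w | D) is assumed to be the unsmoothed maximum-likelihood estimate from the last revision, and no smoothing against a background collection is applied.

```python
from typing import Dict, Sequence, Tuple


def _normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Divide each weight by the total over the vocabulary (the w' in D sum)."""
    total = sum(weights.values())
    if total <= 0:
        return {w: 0.0 for w in weights}
    return {w: x / total for w, x in weights.items()}


def p_rha(counts: Dict[str, Sequence[float]], burst_starts: Sequence[int],
          alpha: float, beta: float,
          lambdas: Tuple[float, float, float]) -> Dict[str, float]:
    """Mix the global, burst, and current-revision term distributions.

    counts[w][j - 1] is c(w, v_j), the count of word w in revision j;
    burst_starts holds the 1-based revision numbers b_1..b_m.
    """
    l1, l2, l3 = lambdas
    if abs(l1 + l2 + l3 - 1.0) > 1e-9:
        raise ValueError("lambda weights must sum to 1")

    g, b, cur = {}, {}, {}
    for w, c in counts.items():
        n = len(c)
        # Numerator of P_g: decayed sum over all revisions.
        g[w] = sum(c[j - 1] / j ** alpha for j in range(1, n + 1))
        # Numerator of P_b: decay restarts at every burst start.
        b[w] = sum(c[k - 1] / (k - s + 1) ** beta
                   for s in burst_starts for k in range(s, n + 1))
        # Assumption: P(w | D) comes from the current (last) revision only.
        cur[w] = c[-1]

    p_g, p_b, p = _normalize(g), _normalize(b), _normalize(cur)
    return {w: l1 * p_g[w] + l2 * p_b[w] + l3 * p[w] for w in counts}


if __name__ == "__main__":
    # Toy counts over three revisions (made up), with one burst at revision 2
    # and the LM weights reported in the results slides.
    toy = {"topology": [1, 2, 2], "space": [0, 1, 2]}
    print(p_rha(toy, burst_starts=[2], alpha=1.0, beta=1.0,
                lambdas=(0.3, 0.2, 0.5)))
```

Because each component is normalized separately before mixing, the result is a proper distribution whenever at least one burst is present; with no bursts the burst component is all zeros and the weights would need to be renormalized.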