“Cross-Media and Personalized Learning Applications on top of Digital Libraries”
20 September 2007, Budapest, Hungary
A Study of Web Logs for Personalizing the MultiLingual Information Access to The European Library
M. Agosti (1), T. Coppotelli (1), G.M. Di Nunzio (1), N. Ferro (1), E. van der Meulen (2)
(1) Information Management Systems (IMS) Research Group, Department of Information Engineering, University of Padua, Italy
(2) The European Library, The Netherlands
LADL
2007
Outline
• The European Library
• Web Logs
• MultiLingual Information Access (MLIA)
• Web Log Analysis
  • Geographic provenance
  • Collections usage
• Discussion
• Conclusion and Future work
The European Library
http://www.theeuropeanlibrary.org/
• Provides access to most of the European national digital libraries
• Covers many different languages
• Resources can be either digital or bibliographic (books, posters, maps, sound recordings, videos, etc.)
The European Library
• The European Library collections are suitable for learning and e-learning
  • Quality / Reliability
• How the analysis of web logs can help in implementing MLIA
• Questions we posed at the beginning of our work:
  • How can e-learning systems deal with collections in different languages?
  • Are users interested in multilingual/cross-collections instruments?
  • What are their preferences?
  • Are they influenced by language?
The European Library
• User interaction mainly on the client side
• Web Server logs
HTTP Log: structure of the data
• W3C Extended Log File Format
2007-01-01 00:04:05 192.87.31.35 GET /Index.html - 80 - 70. *.*.* Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1;+.NET+CLR+1.1.4322) http://www.google.com/search?sourceid=navclient&gfns=1&ie=UTF-8&rls=PCTA,PCTA:200633,PCTA:en&q=read+tokyopop+books+online cTargets=collections:a0000,collections:a0037,collections:a0200,collections:a0141,collections:a0010,collections:a0035,collections:a0086,collections:a0132,collections:a0067,collections:a0001,collections:a0062,collections:a0130,collections:a0163,collections:a0211,collections:a0194,collections:a0075,collections:a0073,collections:a0066;+TELSESSID=d551tvd9legbq3rh4l23rjkgh7;+AreCookiesEnabled=889;+cTargetsThemes=theme0 0 0 381 535 203
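A minimal sketch of how a log in this format could be read, assuming the file carries the standard W3C `#Fields:` directive and whitespace-separated values. The function name, the field list, and the sample entry below are hypothetical placeholders, not The European Library's actual configuration.

```python
from typing import Dict, Iterable, Iterator, List


def parse_w3c_extended(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry, keyed by the names in the #Fields directive."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directives start with '#'; only #Fields matters for parsing.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        # Skip entries whose value count does not match the declared fields.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


# Hypothetical, shortened sample (not the real field list of the portal).
SAMPLE = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2007-01-01 00:04:05 192.0.2.1 GET /Index.html 200",
]
ENTRIES = list(parse_w3c_extended(SAMPLE))
```

Splitting on whitespace is enough here because values that contain spaces, such as the user agent, are logged with `+` in place of the space.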
Log analysis
[Diagram labels: Logs, Log processing, On demand query]
• October 1st 2006 to April 30th 2007
• 22,458,350 requests
• 209,900 different sessions reconstructed using cookies
MLIA
• MultiLingual Information Access (MLIA)
  • “possibility for the users of the system to access and search the federated libraries in a personalized way that can allow them to access the collections of documents in their mother tongue and in other preferred languages”
  • Issues:
    • Interaction happens outside the system
    • Logs contain mainly navigational and browsing activity
    • No control over the queries sent
    • MLIA requires modifications to both TEL and the digital libraries' services
    • Control over the central index
MLIA
• Isolated Query Translation
  • Translation and Retrieval are separated
• Pseudo-translation
  • Central index translated
[Diagram labels: Translation Component, Retrieval Component, Translation, Index]
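A toy sketch of the isolated query translation idea, in which a translation component rewrites the query and a separate retrieval component searches an unchanged index. The bilingual dictionary, the documents, and all function names are hypothetical; a real system would use proper translation resources rather than a word-by-word lookup.

```python
from typing import Dict, List, Set

# Toy bilingual dictionary (hypothetical entries, Italian to English).
IT_EN: Dict[str, str] = {"mappe": "maps", "antiche": "old"}


def translate_query(query: str, dictionary: Dict[str, str]) -> List[str]:
    """Translation component: map each term, keeping unknown terms unchanged."""
    return [dictionary.get(term, term) for term in query.lower().split()]


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a simple inverted index: term -> ids of documents containing it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def retrieve(index: Dict[str, Set[str]], terms: List[str]) -> Set[str]:
    """Retrieval component: documents that contain every query term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# The index stays in the collection language; only the query is translated.
DOCS = {"d1": "old maps of Europe", "d2": "sound recordings"}
INDEX = build_index(DOCS)
RESULT = retrieve(INDEX, translate_query("mappe antiche", IT_EN))
```

Because the two components only share a list of terms, either one can be replaced without touching the other, which is the separation the slide refers to.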
MLIA
• Language to language context (+400 language resources)
  • Pivot language
• Does the user like the query translation approach?
  • Poor interaction versus rich interaction
• When should we prefer direct translation?
  • User geographical distribution
  • Collection usage
  • Language to language preferences
  • Promoting usage
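The trade-off between direct translation and a pivot language can be illustrated by counting translation directions: with n languages, direct translation needs n(n-1) directed pairs, while routing everything through one pivot needs 2(n-1). The sketch below only does this arithmetic; the language count in the usage line is a hypothetical example, not a figure from the study.

```python
def direct_pairs(n_languages: int) -> int:
    """Directed language pairs needed when every language maps to every other."""
    if n_languages < 1:
        raise ValueError("need at least one language")
    return n_languages * (n_languages - 1)


def pivot_pairs(n_languages: int) -> int:
    """Directed pairs needed when all translation passes through one pivot language."""
    if n_languages < 1:
        raise ValueError("need at least one language")
    # Each non-pivot language needs one direction into and one out of the pivot.
    return 2 * (n_languages - 1)


# Hypothetical example with 20 languages.
DIRECT = direct_pairs(20)
PIVOT = pivot_pairs(20)
```

The pivot approach needs far fewer resources but translates twice, so direct translation may still be preferable for the language pairs that users request most often.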
Session length
Geographic provenance
Collections usage
Most used collections
[Bar chart: collections usage, vertical axis from 0 to 160000. Chart annotations: Default list, First time user, Collection selection.]
Legend:
*a0000: Online books, images, maps, music...
*a0037: British Library integrated catalogue
*a0141: BN-OPALE PLUS, the catalogue of the Bibliothèque nationale de France
*a0010: Online catalogue of the German National Library
*a0132: General Catalogue Koninklijke Bibliotheek
*a0001: KatNUK: the catalogue of the Slovene National and University Library
*a0067: HELVETICAT: the catalogue of the Swiss National Library
Collections on the horizontal axis: *a0000, *a0037, *a0141, *a0010, *a0132, *a0001, *a0067, *a0211, *a0200, *a0086, *a0163, *a0062, *a0035, *a0130, *a0073, *a0075, *a0066, *a0229, *a0195, *a0224, *a0232, *a0047, a0194, *a0186, a0142, a0063, a0005, a0055, a0129, a0107
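A rough sketch of how per-collection counts could be derived, assuming each entry's `cTargets` cookie lists the targeted collections in the `collections:aNNNN` form seen in the sample log entry. The function names and sample values are illustrative, and this is not necessarily how the authors computed the chart.

```python
from collections import Counter
from typing import Iterable, List

PREFIX = "collections:"


def parse_ctargets(value: str) -> List[str]:
    """Extract collection ids from a value like 'collections:a0000,collections:a0037'."""
    return [item[len(PREFIX):] for item in value.split(",") if item.startswith(PREFIX)]


def count_collections(ctargets_values: Iterable[str]) -> Counter:
    """Count how many cTargets values include each collection id."""
    counts: Counter = Counter()
    for value in ctargets_values:
        counts.update(parse_ctargets(value))
    return counts


# Hypothetical cTargets values taken from three requests.
VALUES = [
    "collections:a0000,collections:a0037",
    "collections:a0000,collections:a0142",
    "collections:a0000",
]
COUNTS = count_collections(VALUES)
```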
Collections usage
Most selected collections
[Bar chart: collections usage, vertical axis from 0 to 12000.]
Collections on the horizontal axis: a0142, a0063, a0005, a0055, a0129, a0107, a0166, a0070, a0171, a0002, a0175, a0167, a0193, a0150, a0123, a0155, a0188, a0082, a0109, a0124, a0014, a0019, a0004, a0104, a0157, a0154, a0136, a0031, a0057
Collections usage
[Chart: collections a0142, a0063, a0005, a0055, a0129 and a0107 plotted against country codes LTU, CAN, EST, ARG, CZE, AUT, NLD, GBR, CHE, GRC, ROM, HUN, PRT, ITA, ESP, BEL, POL, DEU, USA, FRA; value axis from 0 to 1600.]
Conclusions
• How can e-learning systems deal with collections in different languages?
  • Isolated query translation versus pseudo-translation
• Are users interested in multilingual/cross-collections instruments?
  • The data showed that there is demand for multilingual content and that users are interested in more multilingual functionality
• What are their preferences?
  • The results allowed The European Library to better understand user preferences regarding multilingual resources (user distribution, collection selection, …)
• Are they influenced by language?
  • There is a correlation between language and user behavior
Future work: new questions
• Do users navigate the portal displaying data in their mother tongue or do they prefer to use the default language (English)?
  • 80% use the portal as it is
• Do English users search only on English collections?
  • Query language not present in HTTP logs
  • Action logs
    • 53475 sessions, 15674 advanced searches, 783 searches based on language
    • Mostly ENG, FRE, GER, ITA
• Questions?