Presentation by Gert François (De Persgroep)

ACAP for online
newspapers
Presentation to NUV, Amsterdam
14/5/2008
Gert François
Group Manager New Media, De Persgroep, BE
Member TWG ACAP
De Persgroep is a Belgian media group, traditionally active in print media and broadcast media
• Newspapers (Belgium, Netherlands)
• Consumer magazines
• Television
• Radio
Dag Allemaal: the most read magazine in Flanders
• 393.496 copies sold on a weekly basis
• Week after week, Dag Allemaal reaches one Flemish person in 3
[Chart: copies sold per week, '85 to '06]
HLN: the biggest newspaper in Flanders
• 287.857 copies sold on a daily basis
• HLN reaches 1 in 5 Flemish people on a daily basis
[Chart: copies sold per day, '90 to '06]
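Market leader in Flanders on the commercial TV market
[Chart: figures for two target groups, main shopper 18-54 and main shopper 18-44. Values shown: 24,0 / 23,8; 10,0 / 11,5; 9,0 / 9,3; 5,6 / 5,5; 3,2 / 3,4. Channel labels are not included in the transcript.]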
Biggest radio channel in Flanders
Radio in Flanders: market shares (%), 18-44 year olds
• Qmusic: 27,3
• Donna: 22,3
• 4FM: 9,9
• StuBru: 12,7
• Radio 1: 5,3
• Radio 2: 9,6
• Other: 12,9
Short history in new media (since 2001)
• News sites
• Vortals
• Classified sites
[Chart: unique reach per day, Aug 2003 to Sep 2007]
• unique reach per day: 340.951
• unique reach per month: 4.627.599
(based on CIM, November 2007)
Search engines: the love-hate relationship
ACAP began with the search engines:
• killer application of the internet
• the value of search engines to users and to news
publishers is undeniable
• bottom line, there is a “positive” business relationship:
they deliver traffic
The power of search engines is still growing …
Their value for a news site in euros (an estimate)
+ potential advertising revenues from their daily traffic
+ direct daily revenues from AdSense
_______________________________
daily value of search engines
Potential advertising revenues from their daily traffic (an estimate)
• 10 % of 300.000 = 30.000 visitors a day
• Average of 5 page views per visitor = 150.000 page views
• CPM for a news site = 18 €
• Advertising potential: 150.000 / 1.000 × 18 € = 2.700 euros a day
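The estimate above can be reproduced as a short calculation. This is only a sketch: the function and parameter names are illustrative, and reading the 10 % as the share of daily visitors who arrive via search engines is an assumption based on the context of the slide.

```python
def daily_ad_potential(daily_visitors: int,
                       search_share_pct: int,
                       pages_per_visitor: int,
                       cpm_eur: float) -> float:
    """Estimate the daily advertising value (in euros) of search-engine traffic."""
    # Visitors assumed to be referred by search engines
    search_visitors = daily_visitors * search_share_pct / 100
    # Page views those visitors generate
    page_views = search_visitors * pages_per_visitor
    # CPM is the price per 1000 page views
    return page_views / 1000 * cpm_eur


# Figures from the slide: 300.000 visitors, 10 %, 5 page views, CPM of 18 euros
potential = daily_ad_potential(300_000, 10, 5, 18.0)
print(f"{potential:.0f} euros a day")
```

Changing any one input (for example the CPM) shows how sensitive the daily figure is to that assumption.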
Their value for a news site in money (an estimate)
+ 2700 euros (potential advertising revenues from their daily traffic)
+ 1000 euros (direct daily revenues from AdSense, an estimate)
_______________________________
3700 euros / day
But … :
1. The search engines decide what to display, how,
when and where … and they don’t seek positive
consent for their (new) activities
2. We can’t communicate our access policies
But … Google and Yahoo launched (controversial) new
services
1. stories of “old” and “new” online industry
2. Google News is perceived as a competitor
Google News (GN) is smaller than Yahoo News (YN), so why the buzz?
• because it’s Google
• GN is based on a computer formula, not organised by human editors
• GN offers no original content and not much licensed content
3. copyright issues become more pronounced in this new context
My conclusion
Regardless of the publisher’s stance, there is a need to
communicate the conditions of publication on the level of
“services”
But … we can’t communicate our access policies
[Diagram: Robots.txt, CRAWLER, DOCUMENT REPOSITORY, SEARCH INDEX]
What are the current automatic means of communication?
Robots.txt
• small file on the website
• de facto standard supported by most important robots
• is inconsistently applied by the aggregators
• set of instructions too limited
Instructions: disallow
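To make the limitation concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the robot name `ExampleBot` and the `news.example.com` URLs are made up for illustration. The only question the file can answer is whether a given robot may fetch a given URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Disallow: /subscribers/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# robots.txt answers one yes/no question per robot and URL: fetch or not.
allowed_home = parser.can_fetch("ExampleBot", "https://news.example.com/")
allowed_archive = parser.can_fetch(
    "ExampleBot", "https://news.example.com/archive/2007/story.html"
)
print(allowed_home, allowed_archive)
```

Nothing in this vocabulary says what a robot may do with a page once it has been fetched, for instance in which service it may be shown.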
What are the current automatic means of communication?
Meta tags
• embedded in HTML
• allows greater control
• set of instructions still too limited
Instructions: (no)index, (no)follow, noarchive
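As an illustration of how an aggregator might read these page-level instructions, the sketch below collects the directives of a robots meta tag with Python's standard `html.parser`. The class name `RobotsMetaParser` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "index, follow, noarchive"
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


# Hypothetical article page that allows indexing but asks not to be cached
PAGE = (
    '<html><head><meta name="robots" content="index, follow, noarchive">'
    "</head><body>Article text</body></html>"
)

meta = RobotsMetaParser()
meta.feed(PAGE)
print(sorted(meta.directives))
```

The directives apply to the page as a whole, which is why they cannot express a rule about only the pictures or only one service.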
Conclusion
… they are not sufficient. For instance: how do I express that I don’t want my pictures indexed in Google News?
Conclusion:
• there is a problem for the publisher: no way of expressing
permissions and rules regarding the access and use of their
valuable content
• there is a problem for the aggregator: how to comply with
those unspoken rules?
The consequences:
Source: http://www.marketingvox.com/archives/2006/02/02/google_accused_of_stealing_newspaper_content/
Source: http://www.marketingvox.com/archives/2006/07/19/lawsuit_against_google_news_not_dismissed/
Source: http://news.newamericamedia.org/news/view_article.html?article_id=a191005fc71ab3749c713f7cfbdcf58e
Source: http://www.sfnblog.com/index.php/2007/02/13/180-online-digital-copyright-belgium
Source: http://www.lunchoverip.com/2007/08/swiss-publisher.html
The solution:
“The answer to the
machine is in the
machine”
Charles Clark (former copyright adviser to the UK Publishers
Association)
The solution: a formal language
• a syntax and semantics
• both parties understand
• which allows automated handling
• based on the existing standards (robots.txt and META tags)
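One way to picture a language that builds on robots.txt is a file in which extension fields sit next to the conventional ones. The sketch below separates the two groups. The `ACAP-` field names in the sample are assumptions written in the robots.txt style for illustration, not quotations from the ACAP specification, and `split_policy` is a hypothetical helper.

```python
# Illustrative only: the "ACAP-" field names are assumed for this sketch and
# should be checked against the ACAP specification before any real use.
POLICY = """\
User-agent: *
Disallow: /subscribers/
ACAP-crawler: *
ACAP-disallow-crawl: /subscribers/
ACAP-allow-crawl: /news/
"""


def split_policy(text):
    """Separate conventional robots.txt fields from ACAP-prefixed extension fields."""
    legacy, extended = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if ":" not in line:
            continue  # skip blank or malformed lines
        field, value = (part.strip() for part in line.split(":", 1))
        target = extended if field.lower().startswith("acap-") else legacy
        target.append((field, value))
    return legacy, extended


legacy, extended = split_policy(POLICY)
print(len(legacy), "conventional fields,", len(extended), "extension fields")
```

A robot that only understands the conventional fields can keep using them, while a robot that understands the extension fields can read the richer rules from the same file.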
ACAP pilot: the newspaper use cases
Use case 1: communicate about “services”
there is a need to communicate the conditions of publication
on the level of “services”
• opt-in or opt-out
• about the appearance of the extracts (snippets)
• about the images
• …
Use case 2: communicate no-archive
Online news publishers are inclined to disallow the cache
Older versions of a news article shouldn’t be kept in the cache
Because the revenue model is based on subscription fees
Use case 3: enable the crawler to access protected content behind a paywall; this implies crawler authentication
• e.g. content of the hard copy newspaper
• Online publishers gain from having their content indexed by the search engines.
Use case 4: being able to specify the text of the snippet
The test architecture
[Diagram: Acap.persgroep.be; Exalead test crawler (crawl by ExaBotAcap12); Acap1.exalead.com; Acap2.exalead.com]
The results …
Use case 1: communication about “services”
Use case 2: communication about “no-archive”
Use case 3: crawler authentication
[Diagram: Exalead service 1 and the Exalead test crawler crawl Acap.persgroep.be]
• Mechanism for crawler recognition: if vIpAdres.equals(trusted crawlers-ip’s) -> match
• Trusted IPs: 193.47.80.77
• Analysis of the Apache logs: pairs of IPs and agent names, by DNS lookup (forward & reverse)
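The recognition rule on the slide (match the visitor's IP against a list of trusted crawler IPs) and the forward and reverse DNS check could be sketched as follows. This is not the code used in the pilot. The function names, the `allowed_suffix` parameter and the host name in the demonstration are assumptions; only the IP address comes from the slide.

```python
import socket

TRUSTED_IPS = {"193.47.80.77"}  # the trusted IP shown on the slide


def reverse_lookup(ip):
    """Reverse DNS: IP address -> primary host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host):
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_trusted_crawler(ip, trusted_ips=TRUSTED_IPS, allowed_suffix=None,
                       reverse=reverse_lookup, forward=forward_lookup):
    """Recognise a crawler by IP, optionally confirmed by reverse and forward DNS."""
    if ip not in trusted_ips:
        return False  # the slide's rule: the IP must equal a trusted crawler IP
    if allowed_suffix is None:
        return True  # IP list only, no DNS confirmation requested
    try:
        host = reverse(ip)  # which host name claims this IP?
        # The host must belong to the expected domain and resolve back to the same IP
        return host.endswith(allowed_suffix) and ip in forward(host)
    except OSError:
        return False  # a failed lookup is treated as "not trusted"


# Offline demonstration with stand-in resolvers (the host name is hypothetical)
fake_reverse = lambda ip: "crawler.exalead.example"
fake_forward = lambda host: ["193.47.80.77"]
verified = is_trusted_crawler("193.47.80.77", allowed_suffix=".exalead.example",
                              reverse=fake_reverse, forward=fake_forward)
unknown = is_trusted_crawler("10.0.0.1")
print(verified, unknown)
```

Passing the resolvers as parameters keeps the check testable without network access; in production the defaults perform real DNS lookups.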
Use case 4: communication about restrictions in
results
Conclusions
• ACAP works … proven by publishers and Exalead
• it’s rich but also difficult
• High commitment from the publishers
• It would be great to have some of the “Big 3” search
engines like Google, Yahoo and Microsoft more
engaged
• increase the political pressure as much as possible
• a lot of work still needs to be done
  - refinement of ACAP 1.0 and creation of implementation guide
  - study of some use cases not tested in ACAP v 1.0
  - location end-user
  - fragments of a resource
  - crawler identification/authentication
  - automating take-down procedures
  - syndication
  - examination of ACAP policies in non-HTML resources (pdf, images, video, …)
Thank you
Gert François
Group Manager New Media, De Persgroep