Recent developments in patents statistics and data bases

Recent developments in
patents statistics and data bases
at EPO and OECD
EPIP – Bocconi
February 24-25, 2006
Dominique Guellec
OECD
Structure of the presentation
1. Patents databases for statistical uses:
Patstat
2. Patent indicators for macroeconomic
analysis: Patent families
Patents databases: Current situation
Patent data are difficult to access; each
analyst/researcher has to set up his/her own
database, extracted from data published by
patent offices
=> high cost, duplication of costs,
=> uneven quality,
=> absence of standardisation,
=> lack of transparency.
Purpose of Patstat
Patstat is a response to the current needs.
It is a database of patents designed to
serve statistical purposes => format
compatible with SQL, SAS etc.
It could be used for compiling indicators or
conducting analytical work (policy,
academic)
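As an illustration of what an SQL-compatible format allows, the sketch below counts applications per office and filing year with a plain GROUP BY, run here through Python's built-in sqlite3 module. The table name `applications` and its columns are hypothetical and chosen for this example only; they are not the actual Patstat schema, and the sample rows are made up.

```python
import sqlite3


def count_by_office_year(rows):
    """Count applications per (office, filing year) with a SQL GROUP BY.

    `rows` is an iterable of (appln_id, office, filing_year) tuples.
    The table layout is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE applications ("
        "appln_id INTEGER PRIMARY KEY, office TEXT, filing_year INTEGER)"
    )
    conn.executemany("INSERT INTO applications VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT office, filing_year, COUNT(*) "
        "FROM applications "
        "GROUP BY office, filing_year "
        "ORDER BY office, filing_year"
    ).fetchall()
    conn.close()
    return result


# Made-up sample rows, for illustration only.
sample_rows = [(1, "EP", 1997), (2, "EP", 1997), (3, "US", 1997), (4, "EP", 1998)]
counts = count_by_office_year(sample_rows)
```

The same query would run largely unchanged on any SQL engine once the real table and column names are substituted.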
Contents
• Documentation: Data coming from 73
offices worldwide, since the beginning of the
20th century for certain offices.
• Post grant data: About 40 offices.
Variables in Patstat
• Application information (dates, numbers)
• Applicants information
• Inventors information
• Priorities information
• IPC classes information
• National patent classes information
• Publications information
• References (citations) information (about 10 countries)
• Licence information
• Entry into force information (by country)
• Lapse information (by country)
Cleaned names
• Current effort sponsored by Eurostat for
cleaning the names of applicants at the EPO
and USPTO (correcting misspellings etc.).
• Cleaned names will be made available in
Patstat.
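To give an idea of what name cleaning involves, here is a minimal sketch that maps spelling variants of an applicant name to one key. It is not the method used in the Eurostat-sponsored work: the rules (strip accents and punctuation, upper-case, drop a short hypothetical list of legal forms) and the company names are assumptions for illustration, and real misspellings would need fuzzy matching on top of this.

```python
import re
import unicodedata

# Hypothetical, deliberately short list of legal-form tokens to drop.
LEGAL_FORMS = {"INC", "CORP", "CORPORATION", "LTD", "LIMITED",
               "GMBH", "AG", "SA", "CO", "COMPANY"}


def normalise_name(raw):
    """Return a simplified key for an applicant name."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove dots so that "S.A." becomes "SA" rather than "S A".
    text = text.replace(".", "")
    # Replace any other punctuation with spaces, then upper-case.
    text = re.sub(r"[^A-Za-z0-9 ]+", " ", text).upper()
    # Drop legal-form tokens and collapse whitespace.
    tokens = [tok for tok in text.split() if tok not in LEGAL_FORMS]
    return " ".join(tokens)


# Made-up variants of one fictitious applicant.
variants = ["Acme Widgets Ltd.", "ACME WIDGETS, LIMITED", "Acmé Widgets Ltd"]
cleaned = {normalise_name(v) for v in variants}
```

All three variants collapse to the single key in `cleaned`, so patents filed under any of them would be attributed to the same applicant.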
Patstat sources
EPO sources:
• DocDB
• PRS
• EPASYS
• CDS
Other sources:
• US publications
• EUROSTAT name mapping
• ...
An evolving product
• Will adapt to needs expressed by users
• More variables could be added, e.g.
procedural data in EPO etc.
• Construct various tools for manipulating
the data and complementary tables (e.g.
families, citations)
Contribution of users
• Checking quality (more than 50 million
records) => reporting defects to the EPO!
Further needs:
• Cleaning names for non-Western
companies (Asia)
• Cleaning SME names
• Consolidating groups of enterprises
Conditions of access
• The first complete version is to be issued
in early April 2006, with updates twice a
year thereafter.
• Available to all users committing to
non-commercial use and no further
dissemination of the data.
A hub
Patstat will find its place in the growing
industry of patents databases: due to its
harmonised priority numbers, it could be
used as a pivot to match data from various
patent offices – hence allowing
“harmonised diversity”.
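The pivot idea can be sketched as a join on priority numbers: two data sets coming from different offices are matched whenever an application in one shares a priority number with an application in the other. The record layout, the function name and the identifiers below are made up for illustration and do not reflect actual Patstat tables or number formats.

```python
from collections import defaultdict


def match_on_priority(records_a, records_b):
    """Pair applications from two sources that share a priority number.

    Each input is an iterable of (application_id, priority_number)
    tuples; this layout is an assumption made for the sketch.
    """
    # Index the second source by priority number.
    by_priority = defaultdict(set)
    for app_id, priority in records_b:
        by_priority[priority].add(app_id)

    # Look up each record of the first source in that index.
    matches = set()
    for app_id, priority in records_a:
        for other_id in by_priority.get(priority, ()):
            matches.add((app_id, other_id))
    return sorted(matches)


# Made-up identifiers, for illustration only.
epo_records = [("EP1", "JP-1997-000001"), ("EP2", "US-1997-000009")]
uspto_records = [("US7", "JP-1997-000001"), ("US8", "DE-1997-000003")]
pairs = match_on_priority(epo_records, uspto_records)
```

The match only works if both sources write priority numbers in the same format, which is where harmonised priority numbers matter.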
Patent families
Patent indicators
As an extremely rich source of information,
patents can be used as indicators
reflecting the technological activity of
countries => location of R&D, circulation of
knowledge, co-operation in R&D,
specialisation, technological performance
etc.
BUT... possible noise and biases in the data
make elaborate filtering necessary.
Sources of noise and bias
• Patents are complex entities... various types of
titles (e.g. applications vs. grants, priorities vs.
divisionals), cross country differences and
changes over time in legal systems.
• Heterogeneity in value (highly skewed
distribution).
• Patenting strategies of companies create
distortions in the data (e.g. cross-industry
differences in propensity to patent, home bias
etc.) => patent data reflect competitive strategy
rather than just technology?
Country shares of patents applied for at the EPO and
patent grants by the USPTO for priority year 1997, %
(Source: OECD)
                   EPO applications   USPTO grants
European Union           46.6             16.4
United States            28.7             52.8
Japan                    16.7             20.7
Other countries           8.0             10.1
Grants or applications?
Country shares in JPO patents, 2004, %
           Applications   Grants
Europe          4.8         4.3
US              5.2         3.8
Japan          83.0        90.7
Others          7.0         1.2
One candidate as a solution:
Triadic families
• A patent family is a set of applications or
patents filed in different offices to protect
the same invention.
• A Triadic family (OECD definition) is a set
of applications at the EPO and JPO and
grants by USPTO which share one or
more priorities.
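A minimal sketch of this definition, working one priority at a time: a priority is flagged as triadic when it is claimed by at least one EPO application, one JPO application and one USPTO grant. The record layout, office codes and identifiers are assumptions made for the example, and the sketch ignores the consolidation of families that share several priorities.

```python
from collections import defaultdict

# The three events a priority must have to count as triadic here.
REQUIRED = {("EP", "application"), ("JP", "application"), ("US", "grant")}


def triadic_priorities(records):
    """Return the priorities backed by all three required events.

    `records` is an iterable of (priority, office, kind) tuples, where
    `kind` is "application" or "grant" (hypothetical layout).
    """
    seen = defaultdict(set)
    for priority, office, kind in records:
        seen[priority].add((office, kind))
    # Keep priorities whose events include every required one.
    return sorted(p for p, events in seen.items() if REQUIRED <= events)


# Made-up records: P1 is triadic, P2 lacks a USPTO grant.
sample_records = [
    ("P1", "EP", "application"),
    ("P1", "JP", "application"),
    ("P1", "US", "grant"),
    ("P2", "EP", "application"),
    ("P2", "JP", "application"),
    ("P2", "US", "application"),
]
triadic = triadic_priorities(sample_records)
```

P2 is left out because a USPTO application without a grant does not satisfy the definition used on this slide.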
Advantages of patent families
Address two issues in patent counts
=> heterogeneity in value
=> cross country biases
Heterogeneity in value
• Patent filing is costly (fees, translation,
attorney, enforcement) => applicants are
selective: filing in several jurisdictions
should be justified by expected value.
• Members of triadic families are more cited
than other patents, have more claims etc.
Home advantage
• Families are measured on a more neutral
ground than applications filed in a single
jurisdiction.
Country shares in patent indicators
Priority year 1999, % (Source: OECD)
                   Triadic patent families    EPO    USPTO
European Union              32.4              46.5    15.9
United States               34.0              27.8    52.6
Japan                       26.6              17.4    20.6
Other countries              7.0               8.3    10.9
Technical problems in compiling
patent families
• No one-to-one correspondence between filings
in different countries (e.g. two JPO priorities may
be combined into one USPTO application and one
EPO application), plus problems with divisionals etc.
=> family counts could be biased if one counts
ALL priorities.
• OECD solution = "consolidation": All applications
sharing one or more priorities are counted as
ONE family.
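The consolidation rule can be sketched as a connected-components problem: applications and priorities are nodes, each priority claim is a link, and every connected group counts as one family. The union-find code below is an illustrative sketch, not the OECD implementation; it assumes that "sharing one or more priorities" is applied transitively, and the identifiers are made up.

```python
def consolidate(links):
    """Group applications into consolidated families.

    `links` is an iterable of (application_id, priority_id) pairs.
    Returns a sorted list of families, each a sorted list of applications.
    """
    parent = {}

    def find(node):
        # Follow parent pointers to the root, halving the path on the way.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag nodes so application and priority ids can never collide.
    for app, priority in links:
        union(("app", app), ("prio", priority))

    families = {}
    for node in list(parent):
        if node[0] == "app":
            families.setdefault(find(node), set()).add(node[1])
    return sorted(sorted(members) for members in families.values())


# The slide's example: two JPO priorities behind one USPTO and one EPO
# application, plus one unrelated filing (all identifiers are made up).
sample_links = [
    ("US-A", "JP-P1"), ("US-A", "JP-P2"),
    ("EP-B", "JP-P1"), ("EP-B", "JP-P2"),
    ("EP-C", "DE-P3"),
]
families = consolidate(sample_links)
```

Counting priorities in this sample would give three families, whereas consolidation gives two: the two JPO priorities collapse into a single family.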
The impact of consolidation on family number
(source: OECD)
[Figure: number of basic patent families (A) and of
consolidated patent families (A*, after the
consolidation filter), 1985 to 1997; vertical axis
from 0 to 60 000.]
Future developments
• US applications instead of grants
• Cross-industry biases
• Families citations
• Improving timeliness (nowcasting)
• Extending to more jurisdictions than EPO,
JPO and USPTO?