MCD Status Report
Trevor Watson
4 February 2008
MCD Status Report
Some facts and figures about the main MCD processes:
• loading of new documents from 2006 (IPC8 Frontfile)
• loading of IPC data for pre-2006 documents (IPC8 Backfile)
• IPC Revision
4 Feb 2008
How much IPC data is there in the MCD?

Location                        Level      No. of IPC symbols
Family (37m)                    Subclass              760,000
                                Core               63,000,000
                                Advanced           69,000,000
Publication (60m)
  2006-08 (5m)                  Subclass               22,000
                                Core                9,300,000
                                Advanced            9,800,000
  Pre-2006 (55m)                Subclass                   13
                                Core                  240,000
                                Advanced              260,000

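As a rough reading aid, the advanced-level counts can be turned into average numbers of symbols per family or per publication. This is only a sketch: it assumes the three groups of figures belong, in order, to Family (37m), Publication 2006-08 (5m) and Publication pre-2006 (55m), a grouping inferred from the slide layout rather than stated on it. The names `ADVANCED_SYMBOLS` and `symbols_per_item` are illustrative.

```python
# Rounded figures from the slide: (number of families or documents, advanced-level symbols).
# ASSUMPTION: the pairing of counts with locations is inferred from the slide layout.
ADVANCED_SYMBOLS = {
    "Family": (37_000_000, 69_000_000),
    "Publication 2006-08": (5_000_000, 9_800_000),
    "Publication pre-2006": (55_000_000, 260_000),
}


def symbols_per_item(items: int, symbols: int) -> float:
    """Average number of advanced-level symbols per family or document."""
    return symbols / items


averages = {
    location: symbols_per_item(items, symbols)
    for location, (items, symbols) in ADVANCED_SYMBOLS.items()
}

for location, average in averages.items():
    print(f"{location:<22} {average:.3f} advanced symbols per item")
```

Under this reading there are a little under two advanced-level symbols per family and per 2006-08 publication, while pre-2006 documents carry almost no IPC8 symbols at publication level.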
% documents loaded in MCD with valid IPC8 (by pub date)
[Chart: one value per publication quarter, Q1 06 to Q4 07; plotted values range from 89.5% to 97.1%]

Frontfile errors and warnings for documents published in November 2007

Total number of documents        192,129
Advanced level symbols stored    409,627

E/W    Count     %   Message
E        371         Invalid symbols not stored
W     25,710    6%   Version Indicator incorrect, value from Validity File taken
W        540         Classification value defaulted to 'N' or 'I'
W        233         Indexing code not allowed as invention information, changed to 'N'
W     38,443    9%   Action date empty or invalid, set to publication date
W     51,513   13%   Original/Reclass indicator not 'B', 'R', 'V', or 'D', defaulted to 'B'
W      1,600         Reclassified IPC provided as new pub data in front file
W     51,519   13%   Source indicator not 'H', 'M', or 'G', defaulted to 'H'
W         36         Additional information delivered as 'F', changed to 'L'
W         35         More than one 'F' provided, defaulted to 'L'

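The slide does not say what the percentages are measured against. A minimal check, assuming the base is either the number of advanced level symbols stored or the number of documents, shows which of the two fits the stated figures. The helper `percent_of` and the variable names are illustrative.

```python
# Figures from the slide for documents published in November 2007.
TOTAL_DOCUMENTS = 192_129
ADVANCED_SYMBOLS_STORED = 409_627

# Warning counts that carry a percentage on the slide: count -> stated percentage.
STATED = {38_443: 9, 51_513: 13, 25_710: 6, 51_519: 13}


def percent_of(count: int, base: int) -> int:
    """Share of `base` represented by `count`, rounded to a whole percent."""
    return round(100 * count / base)


# ASSUMPTION: the base is one of the two totals given on the slide.
matches_symbols = all(
    percent_of(count, ADVANCED_SYMBOLS_STORED) == pct for count, pct in STATED.items()
)
matches_documents = all(
    percent_of(count, TOTAL_DOCUMENTS) == pct for count, pct in STATED.items()
)

print("consistent with symbols as base:  ", matches_symbols)
print("consistent with documents as base:", matches_documents)
```

Only the symbol count reproduces all four percentages, so the warnings appear to be counted per symbol rather than per document.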
MCD backfile history (pre-2006 documents)
2005 - IPC8 derived from ECLA, IPC 1-7 (and incrementally in 2006, 2007)
2006 - AU, DE, JP, EAPO, LT, PT, SI, US
2007 - DE, MD, RO, RU
2008 - BR, CZ, (KR), UA

Sources of backfile data
Sources of advanced family-level symbols - January 2008

Source                Symbols   Share
ECLA                    24.4m     35%
IPC1-7                   9.6m     14%
Backfiles from NOs      35.1m     51%
Total                   69.1m

Overall level of completeness (by document) around 91%

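The source figures can be cross-checked against the stated total and shares. The sketch below uses only the numbers on the slide; the dictionary name `SOURCES_M` is illustrative.

```python
# Advanced family-level symbols by source, in millions (January 2008 figures from the slide).
SOURCES_M = {"ECLA": 24.4, "IPC1-7": 9.6, "Backfiles from NOs": 35.1}

# Sum of the three sources, rounded to the precision used on the slide.
total_m = round(sum(SOURCES_M.values()), 1)

# Share of each source in the total, as a whole percent.
shares = {name: round(100 * value / total_m) for name, value in SOURCES_M.items()}

print(f"Total: {total_m}m")
for name, share in shares.items():
    print(f"{name:<20} {share}%")
```

The three sources add up to the 69.1m total, and the recomputed shares agree with the rounded percentages on the slide.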
Backfile completeness
• overall figure of 91% can only be a rough guide
• aspects of completeness
  – date
  – kind codes
  – technical area
  – quality (the difficult one!)
• suggestion:
  – each NO to produce a standard 'Statement of MCD Completeness' for their own collection

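The slides do not define what a 'Statement of MCD Completeness' would contain. Purely as an illustration, one record per office could carry the four aspects listed above. Everything in this sketch is hypothetical: the class name, the field names, and the example values (office "XX" is a placeholder, not a real collection).

```python
from dataclasses import dataclass, field


@dataclass
class CompletenessStatement:
    """Hypothetical record for one office's 'Statement of MCD Completeness'."""

    office: str                       # office code; "XX" below is a placeholder
    date_range: tuple[int, int]       # first and last publication year covered
    kind_codes: list[str] = field(default_factory=list)
    technical_areas: list[str] = field(default_factory=list)
    quality_notes: str = ""           # free text, since quality is hard to quantify

    def summary(self) -> str:
        """One-line summary covering date, kind codes, technical area and quality."""
        first, last = self.date_range
        kinds = "/".join(self.kind_codes) or "unspecified"
        areas = ", ".join(self.technical_areas) or "unspecified"
        quality = self.quality_notes or "not assessed"
        return (
            f"{self.office}: {first}-{last}; kind codes {kinds}; "
            f"areas {areas}; quality {quality}"
        )


# Illustrative values only; they do not describe any real collection.
example = CompletenessStatement(office="XX", date_range=(1970, 2005), kind_codes=["A1", "B1"])
print(example.summary())
```

A fixed set of fields like this would make statements from different offices comparable, which is presumably the point of asking for a standard form.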
MCD revision life-cycle
Revision periods: 2007.01, 2007.10, 2008.01, 2008.04
• Load RCL and VSF
• Distribute working lists
• Store results (with auto deactivation)
• Automatic reclassification (two periods marked x)
• Distribute residual working lists
• Exchange results
• Deactivate remaining symbols (?)

Revision periods                                 2007.01   2007.10   2008.01   2008.04
Families on original working lists                43,359     9,888    26,352     5,800
Families on residual working lists (02.02.08)      4,618     2,233
  - docs that arrived after first WLS
  - docs where new symbols not added

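To put the residual working lists in proportion, the share of families still outstanding can be computed per period. This is a sketch under one assumption: the two residual figures belong to the 2007.01 and 2007.10 periods, in that order, which the slide layout suggests but does not state. The names `ORIGINAL`, `RESIDUAL` and `residual_share` are illustrative.

```python
# Families on the original working lists per revision period (figures from the slide).
ORIGINAL = {"2007.01": 43_359, "2007.10": 9_888, "2008.01": 26_352, "2008.04": 5_800}

# ASSUMPTION: the two residual figures (as of 02.02.08) belong to the two earliest periods.
RESIDUAL = {"2007.01": 4_618, "2007.10": 2_233}


def residual_share(period: str) -> float:
    """Percentage of families from the original list that are on the residual list."""
    return 100 * RESIDUAL[period] / ORIGINAL[period]


for period in RESIDUAL:
    print(f"{period}: {residual_share(period):.1f}% of families on the residual list")
```

Under that assumption roughly a tenth of the 2007.01 families and a little over a fifth of the 2007.10 families were still on residual lists at the start of February 2008.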
Revision in practice
• revision periods overlap more than expected
  – MCD solution: weekly / monthly procedures
• termination of revision period not clearly defined

Sources of IPC8 data
• EPOQUE and esp@cenet
• regular DocDB bibliographic data products
• quarterly IPC8 incremental DVDs
• revision results files (yet to be released)

Thank you for your attention!
[email protected]