Patent Citation Networks
• Bernard Gress
• http://student.ucr.edu/~gressb01
• Fannie Mae Inc., Washington DC.
• Forthcoming in The Mathematica Journal
• http://www.mathematica-journal.com/
The Patent Citation Dataset
• Patent citations are part of the legal patent process where the patent
applicant has the duty to disclose any knowledge of 'prior art'
amongst previous patents.
• Some objectivity in the process is provided by the government
patent examiner who is supposed to be an expert in the area and
who approves the final citation.
• The network established by patent citations allows one to trace the
flow of technology through time, from patent to patent, and across
fields.
• Studies of technological spillover effects, the impact or influence of
individual patents, the rates of technological development, and other
such issues, can be assisted by the consideration of patent citations.
The Patent Citation Dataset - continued
• Source: Hall, Jaffe, and Trajtenberg, and the National Bureau of
Economic Research (NBER)
(http://www.nber.org/patents/).
• The primary database (cite75_99.zip) contains
22,309,440 pairwise patent citations among
more than 3 million U.S. patents granted between
January 1963 and December 2002.
• The secondary database (pat63_02f.txt) contains
records for 3,414,910 patents with 25 fields each.
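As a rough sketch of how the pairwise citation file might be held in memory (written in Python rather than Mathematica, purely for illustration), each row can be read as a (citing, cited) pair and indexed in both directions. The column names CITING and CITED, the function name load_citations, and the three sample rows (borrowed from the 3858382 example shown later in the talk) are assumptions of this sketch, not something the slides specify.

```python
import csv
import io
from collections import defaultdict

def load_citations(lines):
    """Read (citing, cited) rows and index them in both directions."""
    cites = defaultdict(set)      # patent -> patents it cites (backward links)
    cited_by = defaultdict(set)   # patent -> patents that cite it (forward links)
    for row in csv.DictReader(lines):
        citing, cited = int(row["CITING"]), int(row["CITED"])
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by

# Assumed layout: a header line followed by one citation pair per line.
sample = io.StringIO(
    "CITING,CITED\n"
    "3858382,1794517\n"
    "3858382,2045678\n"
    "4085822,3858382\n"
)
cites, cited_by = load_citations(sample)
print(sorted(cites[3858382]))     # patents cited by 3858382
print(sorted(cited_by[3858382]))  # patents citing 3858382
```

Keeping both maps makes it cheap to walk the network backward (to progenitors) and forward (to descendants).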
Structure of Primary Database
(cite75_99.zip)
Structure of Secondary Database
(pat63_02f.txt)
Patent Numbers Issued Serially
[Figure: Grant Date by Patent Number. Axis ticks run from 4000 to 16000 and from 3.5 × 10^6 to 6.5 × 10^6.]
Two Types of Citation Networks
• A Citation Lineage
– all of the progenitors and descendants by
citation reference, so long as no siblings are
brought into the picture
• A Citation Neighborhood
– all those patents that are within a specified
network distance of the patent of interest,
regardless of relationship, including all
'siblings' and 'cousins'.
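One possible reading of these two definitions, sketched in Python for illustration: a lineage follows citations strictly backward (progenitors) or strictly forward (descendants), while a neighborhood ignores direction, so siblings and cousins are picked up. The function names, the (older, newer) edge convention, and the toy graph are assumptions of this sketch, not the author's PatentLineage or PatentNeighborhood code.

```python
from collections import defaultdict

def build_links(pairs):
    """pairs: iterable of (older, newer) edges, i.e. cited -> citing."""
    parents, children = defaultdict(set), defaultdict(set)
    for older, newer in pairs:
        parents[newer].add(older)
        children[older].add(newer)
    return parents, children

def _walk(start, step, generations):
    """Collect everything reachable from start in at most `generations` steps."""
    seen, frontier = {start}, {start}
    for _ in range(generations):
        frontier = {n for p in frontier for n in step(p)} - seen
        seen |= frontier
    return seen

def lineage(patent, parents, children, generations):
    """Progenitors (backward only) plus descendants (forward only); no siblings."""
    up = _walk(patent, lambda p: parents.get(p, ()), generations)
    down = _walk(patent, lambda p: children.get(p, ()), generations)
    return up | down

def neighborhood(patent, parents, children, generations):
    """Everything within the given network distance, ignoring edge direction."""
    def both(p):
        return set(parents.get(p, ())) | set(children.get(p, ()))
    return _walk(patent, both, generations)

# Toy graph: P is cited by A and B (so A and B are siblings); A is cited by D.
parents, children = build_links([("P", "A"), ("P", "B"), ("A", "D")])
print(sorted(lineage("A", parents, children, 2)))       # ['A', 'D', 'P']
print(sorted(neighborhood("A", parents, children, 2)))  # ['A', 'B', 'D', 'P']
```

With one generation the two sets coincide, because siblings only appear at distance two or more.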
There are 15 nodes (the patent itself plus 14 direct relatives) in the 1-generation lineage of patent #3858382:
• PatentLineage[3858382,1]
– PatentsOfInterest {3858382}
– PrintRules {1 -> 3858382, 2 -> 1794517, 3 -> 2045678, 4 -> 2069266,
5 -> 2790591, 6 -> 3044233, 7 -> 3100569, 8 -> 3468100, 9 -> 3646723,
10 -> 4085822, 11 -> 4316353, 12 -> 4750694, 13 -> 4863125,
14 -> 5054646, 15 -> 6250501}
– Relations {3858382 -> 4085822, 3858382 -> 4316353,
3858382 -> 4750694, 3858382 -> 4863125, 3858382 -> 5054646,
3858382 -> 6250501, 1794517 -> 3858382, 2045678 -> 3858382,
2069266 -> 3858382, 2790591 -> 3858382, 3044233 -> 3858382,
3100569 -> 3858382, 3468100 -> 3858382, 3646723 -> 3858382}
– Vertexes {3858382, 1794517, 2045678, 2069266, 2790591,
3044233, 3100569, 3468100, 3646723, 4085822, 4316353,
4750694, 4863125, 5054646, 6250501}
– IndexPairs {{1,10}, {1,11}, {1,12}, {1,13}, {1,14}, {1,15},
{2,1}, {3,1}, {4,1}, {5,1}, {6,1}, {7,1}, {8,1}, {9,1}}
– IndexRules {1 -> 10, 1 -> 11, 1 -> 12, 1 -> 13, 1 -> 14, 1 -> 15, 2 -> 1,
3 -> 1, 4 -> 1, 5 -> 1, 6 -> 1, 7 -> 1, 8 -> 1, 9 -> 1}
There are 15 nodes for the 1-generation Neighborhood of patent #3858382:
• PatentNeighborhood[3858382,1]
– PatentsOfInterest {3858382}
– PrintRules {1 -> 3858382, 2 -> 1794517, 3 -> 2045678, 4 -> 2069266,
5 -> 2790591, 6 -> 3044233, 7 -> 3100569, 8 -> 3468100, 9 -> 3646723,
10 -> 4085822, 11 -> 4316353, 12 -> 4750694, 13 -> 4863125,
14 -> 5054646, 15 -> 6250501}
– Relations {1794517 -> 3858382, 2045678 -> 3858382,
2069266 -> 3858382, 2790591 -> 3858382, 3044233 -> 3858382,
3100569 -> 3858382, 3468100 -> 3858382, 3646723 -> 3858382,
3858382 -> 4085822, 3858382 -> 4316353, 3858382 -> 4750694,
3858382 -> 4863125, 3858382 -> 5054646, 3858382 -> 6250501}
– Vertexes {3858382, 1794517, 2045678, 2069266, 2790591,
3044233, 3100569, 3468100, 3646723, 4085822, 4316353,
4750694, 4863125, 5054646, 6250501}
– IndexPairs {{1,10}, {1,11}, {1,12}, {1,13}, {1,14},
{1,15}, {2,1}, {3,1}, {4,1}, {5,1}, {6,1}, {7,1}, {8,1},
{9,1}}
– IndexRules {1 -> 10, 1 -> 11, 1 -> 12, 1 -> 13, 1 -> 14, 1 -> 15, 2 -> 1,
3 -> 1, 4 -> 1, 5 -> 1, 6 -> 1, 7 -> 1, 8 -> 1, 9 -> 1}
Citation Networks Over Time
• Mathematica has nice built-in graph visualization functions for
unstructured graphs:
• GraphPlot
• GraphPlot3D
• ShowGraph
• But plotting graphs over time requires my own function:
• PatentPlot
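To illustrate what a time-constrained layout involves, here is a minimal Python sketch that pins each node's horizontal position to its grant year and spreads the nodes of a given year vertically. It is not PatentPlot. The function name time_layout, the toy node names, and their years are hypothetical, chosen only to show the idea.

```python
from collections import defaultdict

def time_layout(grant_year):
    """Map each node to (x, y): x is the grant year, y spreads nodes of the same year."""
    by_year = defaultdict(list)
    for node in sorted(grant_year):
        by_year[grant_year[node]].append(node)
    coords = {}
    for year, nodes in by_year.items():
        for rank, node in enumerate(nodes):
            # Centre each year's column of nodes on y = 0.
            coords[node] = (year, rank - (len(nodes) - 1) / 2)
    return coords

# Hypothetical nodes and grant years, for illustration only.
toy_years = {"p1": 1970, "p2": 1970, "p3": 1975}
layout = time_layout(toy_years)
print(layout)
```

Because citations always point from a later patent to an earlier one, fixing the time axis this way makes every edge run in the same direction across the plot.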
Citation Networks Over Time - continued
The 2-Generation Lineage of 3858382
Citation Networks Over Time - continued
The 2-Generation Neighborhood of 3858382
GraphPlot[PatentNeighborhood[{3858382, 4597749}, 2]]
A nice illustration of the spread of technology over time.
[Figure: citation network plotted over time. Node labels are patent numbers (including 3858382 and 4597749), arranged along a grant-year axis running from 1963 to 2002.]
Coloring nodes by criteria
• I also add functions to color nodes and edges
by different patent characteristics, e.g.
– Patent Technology Category (2- and 4-digit HJT)
– Patent Originality/Generality Index
– Total Number of Citations
GraphPlot3D[PatentNeighborhood[3858382, 7]]
GraphPlot[PatentNeighborhood[3858382,12]]
Colored by technology category
Time Constrained
The 7-Generation Neighborhood of #3858382,
Colored by Technology Class
Network Statistics and Structure Analysis
• Citation Lags
• Network Curvature
• Citation Count Distributions
• HJT Technology Categories
• Originality and Generality
Distributions of Backward Lags
Network Curvature
• The average number of patents reached at each successive network distance.
• Some simple graphs and their respective curvature plots.
Network Curvature - continued
• A much larger network of 91,000 patents over 40 years.
Curvature graphs for each year
Curvature graphs for each year, all together
Curvature graphs for each year, all together, different view
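The quantity behind these curvature plots, the average number of patents first reached at each network distance, can be sketched with a breadth-first search from every node. The Python version below is illustrative only. Treating citations as undirected edges and the function name reach_profile are assumptions of this sketch, and the slides do not say how the author computed it.

```python
from collections import defaultdict, deque

def reach_profile(edges, max_distance):
    """Average, over all start nodes, of the number of nodes first reached at
    each distance 1..max_distance (edges are treated as undirected)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    totals = [0] * max_distance
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == max_distance:
                continue  # do not expand beyond the requested radius
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    totals[dist[nxt] - 1] += 1
                    queue.append(nxt)
    return [t / len(adj) for t in totals]

# A simple path graph 1 - 2 - 3 - 4.
path = [(1, 2), (2, 3), (3, 4)]
profile = reach_profile(path, 3)
print(profile)  # [1.5, 1.0, 0.5]
```

How this profile grows or shrinks with distance is what distinguishes, for example, a chain from a tree or a densely connected cluster.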
Patent Technological Composition
HJT Category  SubCategory  SubCategory Name              Category Name               Total Patents  Frequency (%)
1             11           Agriculture, Food, Textiles   Chemical                    31,781         0.735
1             12           Coating                       Chemical                    64,563         1.493
1             13           Gas                           Chemical                    23,269         0.538
1             14           Organic Compounds             Chemical                    132,904        3.074
1             15           Resins                        Chemical                    118,687        2.745
1             19           Miscellaneous chemical        Chemical                    411,881        9.527
2             21           Communications                Computers & Communications  167,787        3.881
2             22           Computer Hardware & Software  Computers & Communications  119,478        2.763
2             23           Computer Peripherals          Computers & Communications  41,154         0.952
2             24           Information Storage           Computers & Communications  70,164         1.623
2             25           Unknown                       Unknown                     2,548          0.059
3             31           Drugs                         Drugs & Medical             114,011        2.637
3             32           Surgery & Med Inst.           Drugs & Medical             106,104        2.454
3             33           Biotechnology                 Drugs & Medical             31,551         0.73
3             39           Miscellaneous Drugs & Med     Drugs & Medical             26,336         0.609
4             41           Electrical Devices            Electrical & Electronic     133,152        3.08
4             42           Electrical Lighting           Electrical & Electronic     65,907         1.524
4             43           Measuring & Testing           Electrical & Electronic     110,696        2.56
4             44           Nuclear & X rays              Electrical & Electronic     51,408         1.189
4             45           Power Systems                 Electrical & Electronic     139,427        3.225
4             46           Semiconductor Devices         Electrical & Electronic     80,028         1.851
4             49           Miscellaneous Elec            Electrical & Electronic     93,005         2.151
5             51           Mat. Proc & Handling          Mechanical                  259,561        6.003
5             52           Metal Working                 Mechanical                  135,812        3.141
5             53           Motors & Engines + Parts      Mechanical                  179,208        4.145
5             54           Optics                        Mechanical                  82,085         1.899
5             55           Transportation                Mechanical                  152,761        3.533
5             59           Miscellaneous Mechanical      Mechanical                  259,434        6.001
6             61           Agriculture, Husbandry, Food  Others                      110,294        2.551
6             62           Amusement Devices             Others                      53,528         1.238
6             63           Apparel & Textile             Others                      98,543         2.279
6             64           Earth Working & Wells         Others                      71,491         1.654
6             65           Furniture, House Fixtures     Others                      125,006        2.891
6             66           Heating                       Others                      65,573         1.517
6             67           Pipes & Joints                Others                      47,698         1.103
6             68           Receptacles                   Others                      108,177        2.502
6             69           Miscellaneous Others          Others                      438,506        10.142
HJT Technology Category Distribution
[Figure: Distribution of 2-digit tech categories. Horizontal axis: subcategories 11 to 69; vertical axis ticks from 0.02 to 0.08.]
[Figure: Cumulative distribution of patents by tech category.]
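The Frequency column appears to be each subcategory's count as a percentage of the sum of all 37 counts, and a cumulative distribution follows by running totals. The Python sketch below checks that reading against the Total Patents figures from the table. The interpretation of Frequency as a percentage is inferred from the arithmetic and is an assumption, as are the variable names.

```python
# Total Patents by HJT subcategory, copied from the table above.
counts = {
    11: 31781, 12: 64563, 13: 23269, 14: 132904, 15: 118687, 19: 411881,
    21: 167787, 22: 119478, 23: 41154, 24: 70164, 25: 2548,
    31: 114011, 32: 106104, 33: 31551, 39: 26336,
    41: 133152, 42: 65907, 43: 110696, 44: 51408, 45: 139427, 46: 80028, 49: 93005,
    51: 259561, 52: 135812, 53: 179208, 54: 82085, 55: 152761, 59: 259434,
    61: 110294, 62: 53528, 63: 98543, 64: 71491, 65: 125006, 66: 65573,
    67: 47698, 68: 108177, 69: 438506,
}

total = sum(counts.values())

# Share of each subcategory, in percent.
share = {sub: 100 * n / total for sub, n in counts.items()}

# Cumulative share, walking the subcategories in numeric order.
cumulative = []
running = 0.0
for sub in sorted(counts):
    running += share[sub]
    cumulative.append((sub, running))

print(round(share[11], 3), round(share[69], 3))
print(round(cumulative[-1][1], 3))
```

The rounded shares for subcategories 11 and 69 agree with the 0.735 and 10.142 shown in the table, which supports the percentage reading.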
Citation Count Distributions
Citation Count Distributions - continued
Generality and Originality
• The 'Generality' of patent i is
\[ \mathrm{Generality}_i = 1 - \sum_{j=1}^{J} \left( \frac{N_{i,j}}{N_i} \right)^2 \]
where J is the number of patent classes, N_i is the total number of
forward citations for patent i, and N_{i,j} is the number of forward
citations in patent class j for patent i. The second term is a
Herfindahl-type index.
• The 'Originality' of patent i is the same, except with backward
citations (i.e. citations made).
• "Thus if a patent cites previous patents that belong to a narrow
set of technologies, the originality score will be low, whereas
citing patents in a wide range of fields would render a higher
score."
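As a small worked example, the index can be computed from the list of technology classes of a patent's citations. This Python sketch assumes the usual Hall-Jaffe-Trajtenberg form, one minus the sum of squared class shares. The function name herfindahl_dispersion and the sample class lists are hypothetical.

```python
from collections import Counter

def herfindahl_dispersion(citation_classes):
    """Return 1 - sum_j (N_ij / N_i)^2 over the classes of a patent's citations.

    Pass the classes of the citing patents for Generality, or the classes of
    the cited patents for Originality. Returns None when there are no citations.
    """
    n = len(citation_classes)
    if n == 0:
        return None
    class_counts = Counter(citation_classes)
    return 1 - sum((c / n) ** 2 for c in class_counts.values())

# All citations in one class: the narrowest possible spread.
narrow = herfindahl_dispersion([14, 14, 14, 14])
# Citations spread evenly over four classes.
wide = herfindahl_dispersion([14, 22, 33, 45])
print(narrow, wide)  # 0.0 0.75
```

The score is 0 when every citation falls in a single class and approaches 1 as citations spread evenly over many classes, which matches the quoted interpretation.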
Generality and Originality - Continued
Not very interesting: at least no trends over time,
and seemingly no necessary relationship to the concepts
they are intended to measure.
Conclusions
• Mathematica is a nice platform for
network analysis.
• There is a lot of opportunity for research in
this area.
• I don't know what the value of this research
is to the IPI-ConfEx clientele.
References
• [1] B. Hall, A. Jaffe, and M. Trajtenberg, "The NBER
Patent Citations Data File: Lessons,
Insights and Methodological Tools," 2002,
http://emlab.berkeley.edu/users/bhhall/pat/NBERpatdata.pdf
• [2] S. Wolfram, A New Kind of Science, 2002.