Visualizing Two Social Networks
Across Time with SAS®:
Collaborators on a Research
Grant vs. Those Posting on SAS-L
Larry Hoyle
Institute for Policy and Social Research
University of Kansas
SGF2009 paper 229, Larry Hoyle
Visualize These Data
Links
Nodes
A Social Network
Constellation Chart: Nodes
Nodes Have:
Size (age)
Color (gender)
Tip (text)
Constellation Chart Links
Links Have:
Width (Hours)
Color (family)
Tip (text)
Social Network Graph
Two SAS tools:
•Constellation Chart Applet (and Macro)
•Annotate File
Constellation Chart Slider
Slider set to show only links with 19 or more hours spent together
Constellation Chart Slider
Slider set to show only links with 14 or more hours spent together
Constellation Code
title 'Mean Hours Spent Together';
%ds2const(
   ndata=Flints, ldata=FlintTimes, datatype=assoc,
   minlnkwt=30,
   /* Files */
   htmlfile=&outfile,
   codebase=&jarpath,
   /* Appearance */
   width=480,
   height=360,
   colormap=y,
   fntsize=12,
   /* Nodes */
   nid=Person,
   nvalue=age,
   ncolor=gender,
   nlabel=Person,
   ntip=ntip,
   ncolfmt=Gcolor.,
   /* Links */
   lfrom=PersonFrom, lto=PersonTo,
   lvalue=MeanHours,
   linktype=line,
   lcolor=linktype,
   lcolfmt=Lcolor.,
   ltip=ltip,
   sclnkwt=N);
Two Different Sets of Data
Each With Their Own Challenges
• SAS-L (the SAS Listserv)
– Nodes are email addresses of posters (23,827)
– Links are posts to the same thread in the same year (267,209 messages to 82,279 threads)
• Kansas NSF EPSCoR Grant
– Nodes are projects and people
• People have different roles (PI, researcher, support staff)
– Multiple types of links, together on:
• authorship, proposals, listed together in narrative
– Changes across time
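The SAS-L link definition above (posts to the same thread in the same year) can be sketched as follows. This is a Python illustration, not the SAS code behind the slides. The tuple layout, the name `build_links`, the example addresses, and the choice to weight each link by the number of shared thread-years are all assumptions.

```python
from collections import Counter
from itertools import combinations


def build_links(posts):
    """Build undirected links between posters.

    posts: iterable of (address, thread_id, year) tuples, one per message
           (a hypothetical layout for the cleaned SAS-L data).
    Returns a Counter mapping (address_a, address_b), with a < b, to the
    number of thread-years in which both addresses posted.
    """
    posters = {}
    for address, thread_id, year in posts:
        posters.setdefault((thread_id, year), set()).add(address)

    links = Counter()
    for addresses in posters.values():
        # Every pair of distinct posters in the same thread-year is linked.
        for a, b in combinations(sorted(addresses), 2):
            links[(a, b)] += 1
    return links


# Made-up messages for illustration only.
posts = [
    ("ann@example.org", "t1", 2001),
    ("bob@example.org", "t1", 2001),
    ("ann@example.org", "t1", 2001),  # repeat post: still one link
    ("cat@example.org", "t1", 2002),  # same thread, different year: no link
    ("ann@example.org", "t2", 2001),
    ("bob@example.org", "t2", 2001),
]
links = build_links(posts)
print(dict(links))
```

Collecting posters into a set per thread-year keeps repeat posts by one address from inflating the link weight.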
SAS-L Data – Available on the Web
Data Cleaning – Addresses Change
Linked: posting to the same thread
SAS-L - Too Many Nodes for Applet
Approach: Limit the number of nodes
SAS-L Those With Over 100 Posts
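Limiting the node count for the applet could look something like the sketch below, written in Python for illustration rather than taken from the deck. The input shapes, the name `limit_to_top_posters`, and the decision to keep only links whose two endpoints both pass the threshold are assumptions.

```python
from collections import Counter


def limit_to_top_posters(post_addresses, links, min_posts=100):
    """Keep addresses with more than `min_posts` posts and the links among them.

    post_addresses: iterable with one address per message (hypothetical input).
    links: {(address_a, address_b): weight}.
    """
    counts = Counter(post_addresses)
    top = {address for address, n in counts.items() if n > min_posts}
    kept = {pair: weight for pair, weight in links.items()
            if pair[0] in top and pair[1] in top}
    return top, kept


# Tiny made-up example using a threshold of 2 instead of 100.
addresses = ["a"] * 3 + ["b"] * 3 + ["c"]
links = {("a", "b"): 4, ("a", "c"): 1}
top, kept = limit_to_top_posters(addresses, links, min_posts=2)
print(sorted(top), kept)
```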
Most Links are With a Core Group
Too Many Nodes for Applet
Approach: Display All w/ SAS Annotate File
All SAS-L People Posting to SAS-L since 1996, > 5 common posts
Those with more than 100 posts total are red, others are gray
Links with a top poster are blue, other links are black
SAS Annotate File – Arrange Nodes
• How do you arrange the nodes in some meaningful way?
• All Nodes Around a Circle or
• Multidimensional Scaling of some or all nodes
proc mds
   data=SGF2009.TOPPOSTERSSIMILARITY
   out=SGF2009.TopPosters2D
   similar
   dimension=2
   level=ordinal;
run;
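For the first option on this slide, all nodes around a circle, the coordinates are simple to compute. A minimal Python sketch follows, not the deck's SAS code. The name `circle_layout` and its parameters are hypothetical, and the resulting x/y pairs would still need to be written into whatever plotting data set is used.

```python
import math


def circle_layout(node_ids, radius=1.0, center=(0.0, 0.0)):
    """Place nodes at equal angular steps around a circle.

    Returns {node_id: (x, y)}; the first node sits at angle 0.
    """
    n = len(node_ids)
    cx, cy = center
    return {
        node: (cx + radius * math.cos(2 * math.pi * i / n),
               cy + radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(node_ids)
    }


layout = circle_layout(["a", "b", "c", "d"])
for node, (x, y) in layout.items():
    print(node, round(x, 3), round(y, 3))
```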
Problem: MDS on 23K nodes?
• Scale the nodes with the most links (shown in red)
• Arrange the others randomly in a circle around them (shown in gray)
• Links to red nodes in blue, others in black
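The hybrid layout described above can be sketched as follows, assuming the scaled coordinates of the top nodes (for example, from an MDS run) are already available. This is a Python illustration, not the deck's method. The name `ring_around_core`, the `margin` factor that sets the ring radius, and the fixed random seed are all assumptions.

```python
import math
import random


def ring_around_core(core_xy, other_ids, margin=1.5, seed=0):
    """Keep core nodes where they are; put the rest on a surrounding circle.

    core_xy: {node: (x, y)} for the scaled (red) nodes.
    other_ids: remaining (gray) nodes, placed at random angles.
    The ring radius is `margin` times the farthest core node's distance
    from the core centroid (an assumed rule, chosen so the ring clears the core).
    """
    cx = sum(x for x, _ in core_xy.values()) / len(core_xy)
    cy = sum(y for _, y in core_xy.values()) / len(core_xy)
    reach = max(math.hypot(x - cx, y - cy) for x, y in core_xy.values())
    radius = margin * reach if reach > 0 else 1.0

    rng = random.Random(seed)  # seeded so the layout is repeatable
    layout = dict(core_xy)
    for node in other_ids:
        angle = rng.uniform(0.0, 2.0 * math.pi)
        layout[node] = (cx + radius * math.cos(angle),
                        cy + radius * math.sin(angle))
    return layout


core = {"a": (0.0, 0.0), "b": (2.0, 0.0)}
layout = ring_around_core(core, ["g1", "g2", "g3"])
print(layout)
```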
Zoom and Pan With Applet
With Annotate: vector output (e.g. RTF) would allow zoom, but not tips on links
3D with PROC G3D and Annotate
ActiveX and Java Devices Only
3D with PROC G3D and Annotate
Generated in SAS 9.2
3D with PROC G3D and Annotate
Generated From EG 4.1
3D with PROC G3D and Annotate
ActiveX and Java Devices Only
Kansas NSF EPSCoR Phase V
Visualization Needs
• Show relationships among 247 people
• And among 50 projects
• Show change in collaboration across time
• Differentiate core people
• Differentiate principal investigators (PIs)
• Differentiate institutions
• Animate across time
Projects Layer Arranged by
People in Common Across all Years
Figure: Projects arranged by number of people in common. The 50 projects are labeled by project number (5000 to 5507).
Projects with more people in common appear closer together. Line width proportional to number of people in common.
Core People Layer Arranged by
Centroid of Projects to Which They Belong
Figure: KNE Core People, drawn over the project layout (project positions labeled by project number, 5000 to 5507).
Projects with more people in common appear closer together. Large tan dot indicates initial core people.
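Placing each person at the centroid of the projects they belong to reduces to averaging the project coordinates. A minimal Python sketch under assumed inputs follows; it is not the deck's SAS code. The names `centroid_positions`, `project_xy`, and `memberships`, and the project ids and coordinates in the example, are made up for illustration.

```python
def centroid_positions(project_xy, memberships):
    """Place each person at the mean position of their projects.

    project_xy: {project: (x, y)}, e.g. from the projects layer layout.
    memberships: {person: [project, ...]}.
    People with no projects are skipped.
    """
    positions = {}
    for person, projects in memberships.items():
        points = [project_xy[p] for p in projects]
        if not points:
            continue
        positions[person] = (
            sum(x for x, _ in points) / len(points),
            sum(y for _, y in points) / len(points),
        )
    return positions


# Hypothetical projects and people.
project_xy = {"P1": (0.0, 0.0), "P2": (2.0, 0.0), "P3": (2.0, 4.0)}
memberships = {"ann": ["P1", "P2"], "bob": ["P1", "P2", "P3"], "cy": []}
positions = centroid_positions(project_xy, memberships)
print(positions)
```

Because project positions stay fixed across years, people positioned this way from the all-years memberships also stay fixed, which is what the later animation slides rely on.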
People and Links
• People
– Color indicates institution
– White dot is Principal Investigator
– Size is count (e.g. publications)
– Large tan dot indicates core person
• Links
– Width represents count in common
People in Fixed Positions
Allows Animation Across Time (2006)
Figure: Network of KNE People with Proposals or Scientific Products in Common. Proposals Awarded and Scientific Products Through 2006. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
People in Fixed Positions
Allows Animation Across Time (2007)
Figure: Network of KNE People with Proposals or Scientific Products in Common. Proposals Awarded and Scientific Products Through 2007. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
People in Fixed Positions
Allows Animation Across Time (2008)
Figure: Network of KNE People with Proposals or Scientific Products in Common. Proposals Awarded and Scientific Products Through 2008. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
Other Comparisons –
All Proposals and Submissions
Figure: Network of KNE People with Proposals or Scientific Products in Common. All People involved in KNE, and all Submissions. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
Other Comparisons –
Successful Proposals
Figure: Network of KNE People with Proposals or Scientific Products in Common. Proposals Awarded and Scientific Products Through 2008. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
Other Comparisons –
Proposals
Figure: Network of KNE People with Proposals in Common. Proposals Awarded Through 2008. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual proposals, line width signifies proposals in common. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
Other Comparisons –
Scientific Product
Figure: Network of KNE People with Scientific Products in Common. Scientific Products Through 2008. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual scientific products, line width signifies scientific products in common. Large tan dot indicates initial core people. People arranged in proximity to projects with which they are associated.
Other Comparisons –
Combined
Figure: Network of KNE People with Proposals or Scientific Products in Common. Proposals Awarded and Scientific Products Through 2008. Project positions are labeled by project number (5000 to 5507).
Dot size indicates # of individual collaborations, line width signifies number of collaborations. Large tan dot indicates initial core people. A circle within the dot indicates a PI. People arranged in proximity to projects with which they are associated.
Method Comparisons
• Applet
– Coding is Quick
– Slider
– Link Tips
– Memory Limits
– Screen Capture to Publish
– Dynamic Pan and Zoom
– Data Driven Color and Size
• Annotate
– Additional Data Steps
– Animated GIF
– HTML Link Tips (Difficult)
– Many Nodes Possible
– High Quality Reproduction
– No Tips (ODS Vector Output)
– Richer Symbology
Animation Issues – Fix Node Position
Fix the position of nodes across all frames
– Arrange in circle
– Dimension reduction (MDS?)
– Example: KNEGIF.htm
Animation Issues - Interpolation
Dimension reduction that preserves orientation, then interpolate between observations
• SAS Example: could do something like Kansas Data Archive Bubble Plots
Chart from http://www.ipsr.ku.edu/ksdata/
Inspired by Trendalyzer Software http://www.gapminder.org
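One way to read "interpolate between observations" is to generate in-between animation frames by moving each node in a straight line between its positions at two observed time points. The Python sketch below illustrates that reading only. Linear interpolation, the name `interpolate_frames`, and the dictionary layout are assumptions, not something the slides specify.

```python
def interpolate_frames(start, end, steps):
    """Linearly interpolate node positions between two observed layouts.

    start, end: {node: (x, y)} with the same nodes in both.
    steps: number of intervals (must be at least 1).
    Returns steps + 1 layouts, from `start` to `end` inclusive.
    """
    frames = []
    for k in range(steps + 1):
        t = k / steps
        frames.append({
            node: (x0 + t * (end[node][0] - x0),
                   y0 + t * (end[node][1] - y0))
            for node, (x0, y0) in start.items()
        })
    return frames


# Made-up positions for one node at two observation times.
frames = interpolate_frames({"a": (0.0, 0.0)}, {"a": (4.0, 2.0)}, steps=4)
for frame in frames:
    print(frame)
```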
Other Tools
• SAS Graph NV Workshop
• Enterprise Miner
– See paper 109-2009, Barry de Ville, Discover and Drive Brand Activity in Social Networks
Statistics - Clustering
• Clustering Coefficient
– Global
– Proportion of triads that have third link
When BA and BC are present, is AC present?
Figure: nodes A, B, and C with links B-A and B-C; the A-C link is marked "?"
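The global clustering coefficient described above, the proportion of connected triads (A-B-C through a center node B) whose third link A-C is present, can be computed directly from an edge list. This is a minimal Python sketch, not SAS code from the deck. The name `global_clustering` and the example edges are made up, and the graph is assumed to be undirected with no self-links.

```python
from itertools import combinations


def global_clustering(edges):
    """Proportion of connected triads whose third link is present.

    edges: iterable of (node, node) pairs for an undirected graph
           (assumed to contain no self-links).
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triads = closed = 0
    for center, nbrs in neighbors.items():
        # Each pair of neighbors of `center` forms a triad a-center-c.
        for a, c in combinations(sorted(nbrs), 2):
            triads += 1
            if c in neighbors[a]:
                closed += 1
    return closed / triads if triads else 0.0


# Triangle A-B-C plus a pendant node D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
coefficient = global_clustering(edges)
print(coefficient)
```

In the example there are five triads (one centered on A, one on B, three on C), and three of them are closed by the A-B-C triangle.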
Statistics - Betweenness
• Betweenness Centrality
– Individual
– Sum of proportion of shortest paths that go through a given node
Figure: graph of nodes v, w, x, y, and z
Contributing to Centrality for v –
wvz and wxz – v is central in 1 of 2 shortest w-z paths
wvy – v is central in 1 of 1 shortest w-y paths
wx – v is central in 0 of 1 shortest w-x paths
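The contributions listed above can be reproduced with a short breadth-first search that counts shortest paths. This is a Python sketch, not part of the original deck. The edge list (w-v, w-x, v-z, x-z, v-y) is an assumption inferred from the paths named on the slide, and `pair_contributions` is a hypothetical helper name.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(neighbors, source):
    """Breadth-first search returning distances and shortest-path counts."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]
    return dist, count


def pair_contributions(edges, v):
    """For each pair (s, t) not involving v, the fraction of shortest
    s-t paths that pass through v (undirected, unweighted graph)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    dist_v, count_v = shortest_path_counts(neighbors, v)
    result = {}
    for s, t in combinations(sorted(n for n in neighbors if n != v), 2):
        dist_s, count_s = shortest_path_counts(neighbors, s)
        if t not in dist_s:
            continue  # s and t are not connected
        through = 0
        if s in dist_v and t in dist_v and dist_v[s] + dist_v[t] == dist_s[t]:
            through = count_v[s] * count_v[t]
        result[(s, t)] = through / count_s[t]
    return result


# Edge list inferred from the slide's example paths (an assumption).
edges = [("w", "v"), ("w", "x"), ("v", "z"), ("x", "z"), ("v", "y")]
contrib = pair_contributions(edges, "v")
betweenness_v = sum(contrib.values())
print(contrib)
print(betweenness_v)
```

The slide lists only the pairs that start at w; summing over every pair of other nodes gives the full betweenness for v on this assumed graph.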
Questions?
Larry Hoyle
[email protected]