SOC 206
INTRODUCTION TO NETWORK ANALYSIS I
U.S. interstate highway system
The switch yard at New
York's Niagara Project
became useless metalwork
when a blackout struck the
eastern United States
http://www.sfgate.com/cgi-bin/object/article?f=/c/a/2003/08/15/MN191082.DTL&o=0
Network of physical
interactions between
nuclear proteins [...]
consisting of all proteins
that are known to be
localized in the yeast
nucleus [...], and which
interact with at least one
other protein in the
nucleus.
This subset consists of 318
interactions between 329
proteins.
Note that most neighbors
of highly connected nodes
have rather low
connectivity.
Maslov and Sneppen 2002
http://arxiv.org/abs/cond-mat/0205380
Protein networks in yeast
Spread of TB
Black nodes are persons with clinical disease (and are
potentially infectious), pink nodes represent exposed persons
with incubating (or dormant) infection and are not infectious,
green represent exposed persons with no infection and are not
infectious. The infection status is unknown for the grey nodes.
Unfortunately the 'social butterfly' in this community, the
black node in the center of the graph, is also the most
infectious -- a super spreader.
http://www.orgnet.com/contagion.html
Scientific collaboration
The largest component of the Santa Fe
Institute collaboration network, with the
primary divisions detected by our algorithm
indicated by different vertex shapes.
http://www.pnas.org/content/99/12/7821.full.pdf+html
The Internet
http://www.lumeta.com/research/
http://iserp.columbia.edu/files/iserp/2002_04.pdf also as Bearman et al., Chains of Affection, AJS 2004
Management hierarchy of a major corporation
and decision-making conversations
What do the decision-making links reveal about this
organization?
Some advice flows along formal ties [within the
hierarchy], while other advice flows along informal
ties [outside of the hierarchy].
• There is a strong triangle of input and feedback
amongst Directors 2 and 3 and the General
Manager. These strong, trusting ties have grown
and solidified over many years of working together.
• Director 1 is new to the organization. Manager
12 was hoping to get this position, but Corporate
strongly pushed for Director 1. Notice that Manager
12 is still locally influential in the decision-making
network. Director 1 does not include input from
direct reports in decision-making [remember: A -->
B means that A seeks out B]!
• Director 4 is about to retire. He used to run this
division when it was much smaller. Unlike Director
1, Director 4 does include inputs from his staff.
• The decision-making patterns in the departments
of Directors 2 and 3 are quite different from the
pattern of links in the departments of Directors 1
and 4. Directors 2 and 3 seek information from all
levels of the organization -- their departments show
both vertical and horizontal flows. Several
managers in these departments [23, 24, 34, and
35] are boundary spanners -- connecting to others
outside of their immediate group. Departments 2
and 3 are an example of participatory decision-making --
including inputs from up and down the
hierarchy, as well as inside and outside the
department.
• Who do you see as the most influential person[s]
in shaping decisions in this organization?
http://www.orgnet.com/decisions.html
Interlocking Directorates in the Corporate Community
http://sociology.ucsc.edu/whorulesamerica/power/corporate_community.html
Political books purchased in August 2008 at Amazon.com
http://www.orgnet.com/divided.html
Based on the pattern of connections between the books in the map
above, the most influential political books at the end of the summer 2008
are: What Happened (White House spokesman Scott McClellan’s tell-all)
and The Post-American World (Fareed Zakaria’s book on the rise of
regional powers); neither addressed the ongoing election.
Mark Lombardi: Global (Conspiracy) Networks
Social Network
Analysis of the 9-11
Terrorist Network
http://www.orgnet.com/hijackers.html
http://socialsim.wordpress.com/2007/03/01/another-fabulous-network-image-academy-award-thanks/
World Trade in 1981 and 1992
Lothar Krempel. The structure of world trade between 28 OECD countries in 1981 and 1992. The size of the nodes gives the volume of flows in dollars
(imports and exports) for each country. The size of the links stands for the volume of trade between any two countries. Colors give the regional
memberships in different trade organisations: EC countries (yellow), EFTA countries (green), USA and Canada (blue), Japan (red), East Asian countries (pink),
Oceania (Australia, New Zealand) (black).
http://www.mpi-fg-koeln.mpg.de/~lk/netvis/trade/WorldTrade.html
Social Network Analysis
• A network is a set of objects/nodes and a set of
connections/ties between them
• In a social network, nodes can be people, groups,
organizations, countries, physical or cultural
objects created or used by people, people’s
thoughts or activities, etc.; just about anything
• Explanations based on network ties are usually
categorized as the network approach or the structural
approach
• SNA is mathematical, but not necessarily
statistical
• SNA is not just about methods, it’s theory too
Barabási: Scale-Free Networks
• Scale-free networks obey the power law
– The power law posits that the distribution of links in a network follows a
highly skewed distribution where a few nodes have a great number of ties while the
rest have few
• In a scale-free network new nodes form ties
by preferential attachment
– i.e. those with more connections will get even more
– This is also known as the Matthew principle (Merton)
(see also “rich gets richer,” “cumulative advantage,”
“increasing returns to scale,” “network externalities”)
• On the internet, the number of hyperlinks follows the power law.
– The more a site is linked the more new links it will attract. Hence you have a few giga sites like
Google and millions of sites with only a few links to them.
• Some other examples:
– Protein-to-protein interaction networks
– Sexual partners in humans
– Scientific citation networks
– Semantic networks
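Preferential attachment can be illustrated with a small simulation. The sketch below is a simplified growth model, not Barabási's exact specification: each new node adds a single tie, and the network size and random seed are arbitrary assumptions chosen for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each newcomer links to one existing
    node chosen with probability proportional to that node's degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]      # start from a single tie between two nodes
    endpoints = [0, 1]    # every node appears once per tie it holds
    for new_node in range(2, n_nodes):
        # Drawing uniformly from `endpoints` favors high-degree nodes.
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges


edges = preferential_attachment(2000)
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

distribution = Counter(degree.values())   # degree -> number of nodes
print("largest number of ties held by one node:", max(degree.values()))
print("nodes holding a single tie:", distribution[1])
```

Under these assumptions most nodes should end up with one tie while a handful accumulate many, which is the skewed pattern the slide describes.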
Small World Studies
• Milgram (1967) gave letters to people in
Nebraska and Kansas to get them to a person in
Massachusetts they did not know, through
personal acquaintances. The average number
of steps was 5, the maximum 12, and 25% of
the letters arrived.
• Suppose each person knows 100 people
(including superficial acquaintances). Each
person has a degree of 100.
• Suppose there is no clustering. This person
will have access to 100*100 = 10,000 people in
the second step, 100*100*100 = 1 million in
the third, 100^4 = 100 million in the fourth, and
10 billion in the fifth.
• Example of no clustering if everyone has only
4 friends (the average degree is 4 with a
variance of 0)
http://smallworld.columbia.edu/
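The arithmetic behind the no-clustering example can be checked in a few lines. This is a minimal sketch of the slide's simplification, in which every acquaintance contributes 100 previously unreached people; the function name is an illustrative assumption.

```python
def reach_without_clustering(degree, steps):
    """People reachable at exactly `steps` steps when no two
    acquaintance circles overlap: degree ** steps."""
    return degree ** steps


for step in range(1, 6):
    print(step, reach_without_clustering(100, step))
```

A stricter count would exclude the person each chain arrived from and use 100 * 99^(k-1) at step k, which changes the totals only slightly.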
Six Degrees of Kevin Bacon
or
Who is the Center of the Hollywood Universe?
• 800,000 people in the
Internet Movie Database
• Kevin Bacon’s number
(average chain length) is
2.946
• Sean Connery’s number is 2
• Charlie Chaplin’s number is
3
• Jean-Luc Godard’s 2
• My father-in-law (Janos
Hersko) has number 3
• Average for all 800,000
people is 9.200
http://oracleofbacon.org/center.html
Bacon Number    # of people
0               1
1               1806
2               145024
3               395126
4               95497
5               7451
6               933
7               106
8               13
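The average chain length quoted for Kevin Bacon can be recovered from the counts listed above as a weighted mean. A minimal sketch, with variable names chosen for illustration:

```python
# Bacon number -> number of people, copied from the counts listed above.
bacon_counts = {0: 1, 1: 1806, 2: 145024, 3: 395126, 4: 95497,
                5: 7451, 6: 933, 7: 106, 8: 13}

total_people = sum(bacon_counts.values())
weighted_sum = sum(number * count for number, count in bacon_counts.items())
average_bacon_number = weighted_sum / total_people

print(total_people)                      # 645957
print(round(average_bacon_number, 3))    # 2.946
```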
Vertex = node/object, edge = tie (Newman 2003)
Transitivity and the Forbidden Triad
[Figure: two triads on nodes A, B, and C; panels: Transitivity, Forbidden Triad]
Granovetter: The Strength of Weak Ties
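In Granovetter's argument, the forbidden triad is the case where A has strong ties to both B and C while B and C share no tie at all. The sketch below searches a small set of ties for that pattern; the ties and their strengths are hypothetical data invented for illustration.

```python
from itertools import combinations

# Hypothetical ties, labeled "strong" or "weak"; unlisted pairs have no tie.
ties = {
    frozenset(("A", "B")): "strong",
    frozenset(("A", "C")): "strong",
    frozenset(("C", "D")): "strong",
    frozenset(("A", "D")): "weak",
}


def forbidden_triads(ties):
    """Return (center, other1, other2) where center has strong ties to both
    others and the two others share no tie at all."""
    nodes = sorted({node for pair in ties for node in pair})
    found = []
    for center in nodes:
        strong = [n for n in nodes
                  if ties.get(frozenset((center, n))) == "strong"]
        for x, y in combinations(strong, 2):
            if frozenset((x, y)) not in ties:
                found.append((center, x, y))
    return found


print(forbidden_triads(ties))   # [('A', 'B', 'C')]
```

The triad centered on C is not reported because A and D are linked by a weak tie, which is enough to avoid the forbidden pattern.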
Burt: Structural Holes
“The BEFORE network contains 5
primary contacts and reaches a total
of 15 people. However, there are only
two nonredundant contacts in the
network. Contacts 2 and 3 are
redundant in the sense of being
connected with each other and
reaching the same people. The same is
true for contacts 4 and 5. Contact 1 is
not connected directly to contact 2,
but he reaches the same secondary
contacts; thus contacts 1 and 2
provide redundant network benefits.
Illustrating the other extreme,
contacts 3 and 5 are connected
directly, but they are nonredundant
because they reach separate clusters
of secondary contacts.
In the AFTER network, contact 2 is
used to reach the first cluster in the
BEFORE network, contact 4 is used to
reach the second cluster. The time
and energy saved by withdrawing from
relations with the other three primary
contacts is reallocated to primary
contacts in new clusters. The BEFORE
and AFTER networks are both
maintained at a cost of five primary
relationships, but the AFTER network
is dramatically richer in structural
holes, and so benefits.” (Burt,
Structural Holes, pp. 22-3.)
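One rough way to put a number on redundancy among primary contacts is the effective-size shortcut for binary, undirected ego networks, n - 2t/n, where n is the number of primary contacts and t the number of ties among them. The sketch below applies it to the BEFORE network under a stated assumption: only the direct ties named in the quotation exist among the five contacts. It counts direct ties only, so it misses redundancy that comes from reaching the same secondary contacts, as with contacts 1 and 2.

```python
def effective_size(n_contacts, ties_among_contacts):
    """Shortcut for binary, undirected ego networks:
    n - 2t/n, with n primary contacts and t ties among them."""
    t = len(ties_among_contacts)
    return n_contacts - 2 * t / n_contacts


# Direct ties among primary contacts named in the quoted passage.
# Assumption: no other ties among the five contacts exist.
before_ties = {(2, 3), (4, 5), (3, 5)}
print(effective_size(5, before_ties))   # 5 - 6/5

# With no ties among the contacts, every contact is nonredundant.
print(effective_size(5, set()))
```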
Robust Action and the Rise of the Medici
• Padgett and Ansell argue that
the Medici were powerful
because they could use their
membership in overlapping
social networks strategically.
• Power comes not from being
locked into a single network
or identity but from cultivating
ambiguity by belonging to
many networks. Multiple
networks also deliver more
resources. Both ambiguity
and multiple resources lead to
more discretion and power.
• They could also attain a central
position where others had to
communicate through them.
Putnam: Bowling Alone
Social capital is declining:
• Political participation is declining
• Participation in religious groups is declining
• Labor union membership is declining
• Participation in voluntary organizations is
declining
• Family ties are looser
• Less contact with neighbors
Wellman: Cyberplace
• Larger volume and higher speed of information transfer
• Portability of wireless technology
• Globalized connectivity
• Personalization
• Networked individualism
Cognitive Maps (Carley and Palmquist 1992)
The figure is a graphic illustration of the complete map extracted from the complete
interview with a student at the beginning of the term […]. All concepts in the map are
listed in a circle. The relationship between two concepts is denoted by a line. This map
represents the student's conception of research writing at the beginning of the term,
and it illustrates that those concepts about which the student has the most information
at the beginning are fact, research, topic, and writing. Tracing through some of the
relationships (represented by lines) between concepts reveals that in the student's
view, at the beginning of the term, writing a paper involves having an opinion that is
based on fact which can be found through research.
This figure is a graphic illustration of the map extracted from an interview with the
same student later in the term. This interview shows the student's conception of
research writing at the end of the term. A comparison of Figure 4 and Figure 5 shows
that the student's conception has shifted over time. For example, many of the concepts
used by the student to describe research writing have changed and, for those
concepts that are retained, their relative semantic importance may have changed
(more important, more relationships, more lines). From the beginning to the end of the
term, in the student's mental model of research writing, the concept information has
grown in importance (more lines in Figure 5 than Figure 4) but the concept outline has
decreased in importance to the extent that it does not even appear in the later map.
Once again tracing through some of the relationships between concepts reveals that in
the student's view, at the end of the term, writing a paper involves having information
that depends on facts and a plan that is original and guides research.
Network Data
• Types of questions asked:
– Structural variables – questions about ties/connections
– Compositional variables – questions about
characteristics/attributes of the nodes/actors
• When analyzing networks, researchers often do not
have representative samples of individuals
• Collecting network data often raises human subjects concerns
• Specifying boundaries of the population to study:
– Realist approach – actors themselves decide on
membership in a network by answering name-generating
questions
– Nominalist approach – a list is constructed by a researcher
based on theoretical concerns
Snowball Sampling
Types of Network Data:
One-Mode Network
• Actor-to-actor network
• Actor attributes – characteristics of an actor
• Actors: people, groups, organizations, communities,
nations
• Relations: interactions, transfer of resources or
information, movement, formal roles, kinship
– E.g. network data representing friendships among students in a high
school
This is simple one-mode network data, where the relationships are
binary (yes/no) and asymmetric, and we know one attribute about
everyone (race)
                Student 1 (W)   Student 2 (H)   Student 3 (B)   ...   Student N (W)
Student 1 (W)   X               1               1               ...   1
Student 2 (H)   1               X               0               ...   0
Student 3 (B)   1               0               X               ...   1
...             ...             ...             ...             ...   ...
Student N (W)   1               0               1               ...   X
High school friendship (color is for race)
http://www.soc.washington.edu/users/stovel/Chains.pdf
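A one-mode friendship matrix like the one above can be stored as a list of rows. The sketch below uses only the three students whose ties are fully visible in the slide and assumes that row i, column j means student i names student j; that reading of the direction is an assumption, since the slide does not state it.

```python
students = ["Student 1 (W)", "Student 2 (H)", "Student 3 (B)"]

# Row i, column j is 1 when student i names student j as a friend.
# The diagonal (X in the slide) is stored as 0: no self-nominations.
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

out_degree = [sum(row) for row in adjacency]         # friends named
in_degree = [sum(col) for col in zip(*adjacency)]    # times named by others
is_symmetric = all(adjacency[i][j] == adjacency[j][i]
                   for i in range(3) for j in range(3))

for name, named, named_by in zip(students, out_degree, in_degree):
    print(name, "names", named, "and is named by", named_by)
print("symmetric:", is_symmetric)
```

This visible corner happens to be symmetric; in asymmetric data the out-degree and in-degree lists would differ for at least one student.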
Types of Network Data:
Dyadic Two-Mode (Bipartite) Network
• Two sets of actors
with connections
only between the
sets
[Figure: two-mode examples; panel labels: Lab1, Lab2; Corporations, Nonprofits]
High school dating
          Boy 1   Boy 2   Boy 3   ...   Boy M
Girl 1    0       0       1       ...   0
Girl 2    1       0       0       ...   0
Girl 3    1       0       0       ...   1
...       ...     ...     ...     ...   ...
Girl N    1       0       1       ...   0
http://www.soc.washington.edu/users/stovel/Chains.pdf
Types of Network Data:
Two-Mode, Affiliation Network
• Actor-to-event network
• Actors are the first mode, events are the
second mode
• Events are activities or groups that actors
may participate in or be affiliated with
• Events: social functions, clubs, voluntary
organizations, agreements and treaties for
countries, etc.
• Attributes are recorded for both actors and
events
Breiger: Duality of Persons and Groups
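The duality idea can be sketched with matrix products: multiplying a person-by-group membership matrix by its transpose gives a person-by-person matrix of shared groups, and multiplying in the other order gives a group-by-group matrix of shared members. The membership data below are hypothetical, and the helper functions are illustrative rather than taken from any library.

```python
# Hypothetical person-by-group membership matrix (rows: persons, columns: groups).
membership = [
    [1, 1, 0],   # person 0 belongs to groups 0 and 1
    [1, 0, 0],   # person 1 belongs to group 0
    [0, 1, 1],   # person 2 belongs to groups 1 and 2
]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


def multiply(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Person-by-person: entry (i, j) counts groups shared by persons i and j.
person_by_person = multiply(membership, transpose(membership))
# Group-by-group: entry (k, l) counts persons shared by groups k and l.
group_by_group = multiply(transpose(membership), membership)

print(person_by_person)   # [[2, 1, 1], [1, 1, 0], [1, 0, 2]]
print(group_by_group)     # [[2, 1, 0], [1, 2, 1], [0, 1, 1]]
```

The diagonals carry their own meaning: the number of groups each person belongs to, and the size of each group.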
McPherson: Hypernetwork Sampling
Rows: sample of individuals; columns: sample of organizations
            Org 1   Org 2   Org 3   Org 4   Org 5   Org 6   Org 7
Person 1    0       0       0       0       0       0       0
Person 2    0       0       0       0       0       0       0
Person 3    0       0       0       0       0       0       0
Person 4    0       0       0       0       0       0       0
Person 5    0       0       0       1       0       0       0
Person 6    0       1       0       0       0       0       1
Person 7    0       0       0       0       0       0       0
Person 8    0       0       0       0       1       0       0
Person 9    0       0       0       0       0       0       0
Person 10   0       0       0       0       0       0       0
Types of Network Data:
Ego-Centered Network (GSS 1985)
• Also called personal network
• Centered on respondent
• Often used in surveys with
representative samples
• Ego – the focal actor
• Alter – actors tied to an ego
• Attributes are recorded for
both ego and alters
• Information on alters’ contacts
with each other can be
collected
– In most surveys, Ego’s
connections to alters are ignored
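When alters' contacts with each other are collected, a common summary is the density of ties among alters, leaving ego out because ego is tied to every alter by design. A minimal sketch with a hypothetical respondent who names four alters:

```python
from itertools import combinations

# Hypothetical ego network: ego named four alters and reported which
# pairs of alters know each other.
alters = ["a1", "a2", "a3", "a4"]
alter_ties = {frozenset(("a1", "a2")), frozenset(("a2", "a3"))}

possible_pairs = list(combinations(alters, 2))
observed = sum(1 for pair in possible_pairs if frozenset(pair) in alter_ties)
density = observed / len(possible_pairs)

print(observed, "of", len(possible_pairs), "possible alter-alter ties")
print("density among alters:", round(density, 3))
```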
Quantifying Relationships
• Direction:
– directed vs. symmetric (reciprocal) ties
• Level of measurement:
– dichotomous vs. valued data
• Sign
– positive vs. negative
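The distinctions above correspond to simple transformations of a data matrix. The sketch below takes hypothetical valued, directed data, dichotomizes it at an arbitrary cutoff, and then symmetrizes it in two ways; the cutoff and the data are assumptions for illustration.

```python
# Hypothetical valued, directed data: entry [i][j] is how often i contacts j.
valued = [
    [0, 3, 0],
    [1, 0, 5],
    [0, 0, 0],
]
n = len(valued)

# Dichotomize: any value above the cutoff becomes a 1.
cutoff = 0
binary = [[1 if valued[i][j] > cutoff else 0 for j in range(n)]
          for i in range(n)]

# Symmetrize two ways: a tie exists if either side reports it,
# or only if both sides do (reciprocated ties).
either = [[max(binary[i][j], binary[j][i]) for j in range(n)] for i in range(n)]
both = [[min(binary[i][j], binary[j][i]) for j in range(n)] for i in range(n)]

print(binary)   # [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
print(either)   # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(both)     # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Which symmetrization is appropriate depends on whether an unreciprocated report is treated as a real tie.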
Question Formats: Roster vs. Free Recall
Q1. This is a list of students taking Sociology
101 with you. Please circle your own name.
Please also mark with an X the people on this list
you interact with outside of class.
Q2. Please think of up to three people you
usually go to for advice about your life and
answer a few questions about them.
Question Formats: Free vs. Fixed Choice
Q1. Please think of the people you
usually go to for advice about your
life. Please write down their initials and
answer a few questions about them.
Q2. Please think of up to three people
you usually go to for advice about your
life and answer a few questions about
them.
Question Formats:
Ratings vs. Complete Rankings
Q1. On a scale from 1 to 5 where 1 is
“not important at all” and 5 is “very
important,” please tell us how
important this person is to you
Q2. Please rank these persons in the
order of importance to you with person
number one being the most important.
Summarizing Network Data
• Actor
• Dyad
• Triad
• Subgroup
• Set of actors
• Entire network
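Summaries at three of these levels can be computed from the same matrix. The sketch below, on a hypothetical directed network, reports out-degree per actor, a dyad census (mutual, asymmetric, null), and density for the entire network; the data are invented for illustration.

```python
from itertools import combinations

# Hypothetical directed network: adjacency[i][j] == 1 means i chooses j.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
n = len(adjacency)

# Actor level: number of choices each actor makes.
out_degree = [sum(row) for row in adjacency]

# Dyad level: classify every pair by how many of its two possible ties exist.
census = {"mutual": 0, "asymmetric": 0, "null": 0}
for i, j in combinations(range(n), 2):
    links = adjacency[i][j] + adjacency[j][i]
    census[("null", "asymmetric", "mutual")[links]] += 1

# Entire-network level: share of possible directed ties that are present.
density = sum(out_degree) / (n * (n - 1))

print(out_degree)   # [2, 1, 1, 0]
print(census)       # {'mutual': 1, 'asymmetric': 2, 'null': 3}
```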
Questionnaire
Questions about graduate students and relations related to the studies:
• How frequently do you talk to each person on this list (in person or on the
phone)?
• With whom do you hang out discussing or debating sociological ideas?
• Who do you go to when you need help with your class work, paper,
presentation or research?
• Who do you go to for advice or information on matters related to your studies
(for example, who to choose as your advisor, which classes to take, which
conference to attend, etc.)
• Who have you collaborated with on a class project, paper, conference
presentation, or writing an article?
• Who do you go to when you face a stressful situation related to your
graduate studies and want to talk to someone about it?
• If you receive good news related to your studies or professional career, who
do you tell it to first?
Questions about graduate students and relations beyond graduate studies:
• Who do you go to for help when you need a $10 loan, a ride to a doctor, etc.?
• Who do you go to when you face a stressful situation not related to your
graduate studies and want to talk to someone about it?
• Who do you go to for advice or information when making life decisions not
related to your studies?
• Who do you usually hang out with outside of the department socially (for
example, visit each other for dinner or go to concerts, clubs, or parties
together, etc.)?
Questionnaire
Questions about discussion/reading groups and
classes:
• Which of the following discussion/reading groups
are you a regular participant of?
• Which of the following classes have you taken this
academic year?
Questions about faculty:
• Which faculty members have you been in contact
with beyond your class work this academic year
(you worked for them as TA or RA, you joined their
project, they are on your dissertation committee,
etc.)
Questionnaire
Personal network questions:
• Please write down initials of up to three people who you consider to be the
most important people in your life and answer a few questions about them:
• What is this person’s gender?
• What is this person’s age?
• How is this person related to you? (Please check all that apply)
• Does this person live in San Diego?
• How frequently do you see this person in person?
• How often do you talk to this person on the phone?
• How frequently do you email or text-message this person?
• Do you go to this person for help when you need a loan, a ride to a doctor,
etc.?
• Do you go to this person when you face a stressful situation and want to
talk to someone about it?
• Do you go to this person for advice or information when making life
decisions?
• Do you spend your leisure time with this person (for example, visit each
other for dinner or go to concerts, clubs, and parties together, etc.)?
• Who of the named people know each other, meet and talk to each other
even when you are not around?
Questionnaire
• Questions about the program
• Questions about satisfaction
with the graduate school
experience
• Demographic questions
Academic Advice Network
Student by Faculty Network
Student by Courses Network