Part 1: Analyzing network data with igraph

Load packages: library(igraph) library(igraphdata) library(data.table) library(tidyverse)

Network data

Working with network data in igraph is fairly straightforward. For this tutorial, we’ll start with loading some foodweb networks from the igraphdata package.

data(foodwebs)

Try it

Examine the foodwebs object. It looks like there are multiple networks in this dataset, so first let’s make them into individual network objects.

Click for solution
ChesLower <- foodwebs$ChesLower
ChesMiddle <- foodwebs$ChesMiddle
ChesUpper <- foodwebs$ChesUpper

Let’s check out our igraph object: ChesUpper. What can we learn about the network from this summary? Is the network directed or undirected? What are the edge and vertex attributes?

IGRAPH babc2a0 DNW- 37 215 -- Upper Chesapeake Bay in Summer
+ attr: Citation (g/c), Author (g/c), URL (g/c), name (g/c), name
| (v/c), ECO (v/n), Biomass (v/n), weight (e/n)
+ edges from babc2a0 (vertex names):
 [1] Input      ->Net Phytoplankton Input      ->Picoplankton     
 [3] Input      ->Microphytobenthos Input      ->SAV              
 [5] Input      ->Oysters           Input      ->Blue Crab        
 [7] Input      ->American Eel      Input      ->DOC              
 [9] Input      ->Sediment POC      Input      ->POC              
[11] Oysters    ->Output            Blue Crab  ->Output           
[13] Menhaden   ->Output            Bay anchovy->Output           
+ ... omitted several edges

Node and edge attributes

Let’s examine the node attributes. V(ChesUpper) lists all the vertices in the graph.

+ 37/37 vertices, named, from babc2a0:
 [1] Net Phytoplankton          Picoplankton              
 [3] Free Bacteria              Particle Attached Bacteria
 [5] Heteroflagellates          Ciliates                  
 [7] Rotifers                   Meroplankton              
 [9] Mesozooplankton            Ctenophores               
[11] Chrysaora                  Microphytobenthos         
[13] SAV                        Benthic Bacteria          
[15] Meiofauna                  Deposit Feeding Benthos   
[17] Suspension Feeding Benthos Oysters                   
[19] Blue Crab                  Menhaden                  
+ ... omitted several vertices

V(ChesUpper)$name lists all the name attribute values for the vertices in the graph. Note the difference between the output of these two commands.

 [1] "Net Phytoplankton"          "Picoplankton"              
 [3] "Free Bacteria"              "Particle Attached Bacteria"
 [5] "Heteroflagellates"          "Ciliates"                  
 [7] "Rotifers"                   "Meroplankton"              
 [9] "Mesozooplankton"            "Ctenophores"               
[11] "Chrysaora"                  "Microphytobenthos"         
[13] "SAV"                        "Benthic Bacteria"          
[15] "Meiofauna"                  "Deposit Feeding Benthos"   
[17] "Suspension Feeding Benthos" "Oysters"                   
[19] "Blue Crab"                  "Menhaden"                  
[21] "Bay anchovy"                "Herrings and Shads"        
[23] "White Perch"                "Spot"                      
[25] "Croaker"                    "Hogchoker"                 
[27] "American Eel"               "Catfish"                   
[29] "Striped Bass"               "Bluefish"                  
[31] "Weakfish"                   "DOC"                       
[33] "Sediment POC"               "POC"                       
[35] "Input"                      "Output"                    
[37] "Respiration"               

Try it

How would you view the Biomass attribute? This graph currently has one edge attribute: “weight”. How might you examine this attribute?

Click for solution
V(ChesUpper)$Biomass
 [1]   1356.000    239.000    649.000     36.000     30.000     66.000
 [7]     14.000      3.200    282.000     17.000      0.001    293.000
[13]   2086.000     30.000    700.000   2368.000  27232.000      0.001
[19]    610.000   2136.000    287.000    212.000    282.000    195.000
[25]     50.000    100.000     35.000    450.000    172.000     68.000
[31]     67.000  12504.000 201670.000   5249.000      0.000      0.000
[37]      0.000
E(ChesUpper)$weight
  [1] 9.105056e+04 2.267064e+04 2.156480e+04 1.535296e+03 1.000000e-07
  [6] 1.196000e+01 2.024000e+00 8.800000e+01 2.428800e+04 9.731000e+03
 [11] 1.000000e-07 1.674400e+02 4.416000e+03 3.680000e+02 1.288000e+01
 [16] 1.656000e+01 1.159200e+01 2.208000e+01 1.932000e+01 4.232000e+01
 [21] 0.000000e+00 0.000000e+00 0.000000e+00 2.493200e+04 5.520000e+03
 [26] 4.239268e+04 3.444480e+03 5.987544e+03 8.804400e+03 3.864000e+02
 [31] 1.477520e+02 3.126252e+03 1.107680e+02 0.000000e+00 5.391200e+03
 [36] 0.000000e+00 7.286400e+03 4.508000e+03 4.574976e+03 3.131680e+04
 [41] 0.000000e+00 1.661520e+02 1.218374e+04 3.193688e+03 2.824400e+02
 [46] 7.516400e+01 2.598080e+02 6.660800e+01 7.084000e+02 1.444400e+01
 [51] 6.541200e+01 2.990920e+02 3.245760e+02 2.928360e+02 0.000000e+00
 [56] 0.000000e+00 0.000000e+00 1.518000e+04 9.660000e+02 2.409480e+02
 [61] 1.097818e+04 2.054378e+04 1.768608e+04 4.188760e+02 5.234800e+01
 [66] 5.234800e+01 3.036000e+03 1.288000e+02 4.912800e+01 2.360904e+03
 [71] 5.135992e+03 5.151816e+03 1.288000e+03 2.602680e+04 6.072000e+03
 [76] 7.176000e+01 7.176000e+01 2.054360e+03 2.645000e+02 2.645000e+02
 [81] 6.072000e+03 6.440000e+01 6.951428e+03 6.951428e+03 1.288000e+02
 [86] 1.720400e+02 8.263164e+03 3.841920e+02 3.514400e+01 6.286176e+03
 [91] 6.286176e+03 4.876000e+00 1.180360e+02 7.682000e+01 6.992000e+00
 [96] 3.473920e+02 3.473920e+02 1.067200e+01 7.682000e+01 6.992000e+00
[101] 1.246600e+02 1.246600e+02 7.083080e+02 5.105080e+02 9.825600e+02
[106] 6.915180e+03 6.319480e+02 9.384000e+00 5.362496e+03 5.362496e+03
[111] 9.200000e-02 2.104960e+02 2.104960e+02 9.016000e+03 3.812480e+03
[116] 1.672560e+03 1.672560e+03 1.535296e+03 1.803200e+03 7.624960e+02
[121] 5.531040e+02 3.812480e+03 6.734400e+01 1.867600e+01 6.458400e+01
[126] 1.656000e+01 9.544356e+03 6.734400e+01 3.735200e+01 2.583360e+02
[131] 6.624000e+01 5.520000e+02 1.775600e+01 1.978000e+01 3.348800e+01
[136] 8.660420e+03 9.622280e+02 4.040640e+02 5.602800e+01 2.583360e+02
[141] 6.624000e+01 6.900000e+02 1.775600e+01 1.141720e+02 1.978000e+01
[146] 3.348800e+01 3.487122e+04 3.487122e+04 1.978000e+01 1.660600e+02
[151] 1.660600e+02 1.978000e+02 8.887200e+01 6.706800e+01 1.348904e+03
[156] 1.348904e+03 1.067200e+01 6.734400e+01 4.673600e+01 3.266000e+01
[161] 5.934000e+01 1.332160e+02 1.592520e+02 1.806328e+03 1.806328e+03
[166] 3.956000e+01 1.110440e+02 2.097600e+01 1.241080e+02 1.241080e+02
[171] 1.186800e+01 2.097600e+01 3.293600e+01 3.293600e+01 1.978000e+01
[176] 8.887200e+01 6.283600e+01 1.072720e+02 1.072720e+02 2.217200e+01
[181] 3.836400e+01 3.836400e+01 7.912000e+00 2.097600e+01 3.213560e+02
[186] 3.213560e+02 5.704000e+00 5.704000e+00 4.038800e+01 4.038800e+01
[191] 3.726000e+01 3.726000e+01 5.032400e+01 5.032400e+01 4.213600e+01
[196] 4.213600e+01 7.463500e+04 1.040520e+04 7.212800e+03 6.862464e+03
[201] 6.734400e+01 1.628400e+01 8.696760e+03 1.304514e+04 6.027840e+03
[206] 2.456400e+01 1.180452e+03 7.498497e+04 9.825600e+02 2.305520e+02
[211] 2.106800e+01 1.867600e+01 6.458400e+01 1.656000e+01 1.380000e+02

You can also add additional node or edge attribute data to the graph object. First, we make vectors of the attribute data based on the names. Let’s make one vector defining whether the node is a crab.

#vector for whether or not the node is a crab
CrabYN <- V(ChesUpper)$name 
CrabYN<- ifelse(CrabYN== "Blue Crab", "Y", "N")

Try it

Create a new node attribute that describes whether or not the vertex is a type of plankton, a type of bacteria or an “other” organism. Hint: you might consider using the %like% operator.

Click for solution
OrgType <- V(ChesUpper)$name 
#PlanktonYN %>% replace(PlanktonYN, PlanktonYN %like% "plankton", "Plankton")
OrgType<- ifelse(OrgType %like% "plankton","Plankton", ifelse(OrgType %like% "Bacteria", "Bacteria", "Other"))

Next, we can assign these attributes to the nodes in our graph. There are two ways to do this in igraph. You already know one of the ways from the lesson on dataframe manipulation. Assign the OrgType vector as a node attribute in our ChesUpper graph.

V(ChesUpper)$OrgType <- OrgType
V(ChesUpper)$OrgType
 [1] "Plankton" "Plankton" "Bacteria" "Bacteria" "Other"    "Other"   
 [7] "Other"    "Plankton" "Plankton" "Other"    "Other"    "Other"   
[13] "Other"    "Bacteria" "Other"    "Other"    "Other"    "Other"   
[19] "Other"    "Other"    "Other"    "Other"    "Other"    "Other"   
[25] "Other"    "Other"    "Other"    "Other"    "Other"    "Other"   
[31] "Other"    "Other"    "Other"    "Other"    "Other"    "Other"   
[37] "Other"   

igraph also has a built in function to set vertex and edge attributes. Use the function set.vertex.attribute() to create a new vertex attribute named CrabYN using the CrabYN vector.

ChesUpper <- set.vertex.attribute(ChesUpper,"CrabYN", value=CrabYN)
V(ChesUpper)$CrabYN
 [1] "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "Y"
[20] "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N" "N"

Now check out our graph data again: ChesUpper. Note that the new node attributes have been appended.

IGRAPH babc2a0 DNW- 37 215 -- Upper Chesapeake Bay in Summer
+ attr: Citation (g/c), Author (g/c), URL (g/c), name (g/c), name
| (v/c), ECO (v/n), Biomass (v/n), OrgType (v/c), CrabYN (v/c), weight
| (e/n)
+ edges from babc2a0 (vertex names):
 [1] Input      ->Net Phytoplankton Input      ->Picoplankton     
 [3] Input      ->Microphytobenthos Input      ->SAV              
 [5] Input      ->Oysters           Input      ->Blue Crab        
 [7] Input      ->American Eel      Input      ->DOC              
 [9] Input      ->Sediment POC      Input      ->POC              
[11] Oysters    ->Output            Blue Crab  ->Output           
+ ... omitted several edges

Visualizing networks

When the igraph package is loaded, you can simply run the plot() function on an igraph object and R will know what to do.

##make some quick plots
plot(ChesLower)

plot(ChesMiddle)

plot(ChesUpper)

The default graphical parameters do not look that great. Many nodes are overlapping with one another and the color schemes are not ideal. For now, let’s make the node labels and node size smaller and change the color. The default node size is 15, adjusting it to be smaller.

plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, vertex.color="tomato")

We can also turn the node labels completely off.

plot(ChesUpper,vertex.size=20,vertex.label=NA, vertex.color="tomato")

igraph has many different layouts for network visualization. Let’s try out some different layouts. Some of these will be more or less useful depending on the type of network structure you are plotting.

par(mfrow=c(3,2), mar=c(0.2,1,1,0.2))
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.circle)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.star)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout_as_tree)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.auto)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout_nicely)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.grid)

par(mfrow=c(1,1))

Often network visualization uses kamada-kawai or fruchterman-reingold layouts. These layouts are what you will typically see in articles about network analysis.

##The ones we usually use
par(mfrow=c(1,2))
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.fruchterman.reingold)
plot(ChesUpper,vertex.size=10,vertex.label.cex=0.5, layout=layout.kamada.kawai)

par(mfrow=c(1,1)) #reset parameters

Seems like kamada-kawai might be best suited for this network. Read a little more about force-directed graph drawing here

Try it

We can also assign vertex colors based on node attributes. Create a new node attribute of three different colors based on the OrgType attribute. Then assign this node attribute to the vertex colors and plot the network.

Click for solution
V(ChesUpper)$OrgColor <- ifelse(V(ChesUpper)$OrgType == "Bacteria","tomato", ifelse(V(ChesUpper)$OrgType == "Plankton","royal blue", "dark grey"))

plot(ChesUpper,vertex.size=6,vertex.label=NA, layout=layout.kamada.kawai, vertex.color=V(ChesUpper)$OrgColor)
legend(x="bottomleft", legend=unique(V(ChesUpper)$OrgType), pch=19, cex=0.7, col=unique(V(ChesUpper)$OrgColor))

We can also alter the arrow size, label font and many other plotting parameters to make a more interpretable graph.

plot(ChesUpper,vertex.size=8,vertex.label=NA, layout=layout.kamada.kawai, vertex.color="turquoise",vertex.label.family="Helvetica", vertex.label.color="goldenrod", edge.arrow.size=0.1, vertex.frame.color="corn silk", edge.color="light grey")

Part 2: Raiders of the Lost Ark

In the next section of the lesson, we will explore a dataset of scene interactions from the 1981 Indiana Jones movie, Raiders of the Lost Ark. These data are drawn from the MovieGalaxies database. The edges in this network represent the interactions between characters in this movie.

The first step is to load the data: dat <- read.csv("filepath", header=T, row.names=1). You can get the data here.

Take a look at the data: str(LostArk)

'data.frame':   134 obs. of  3 variables:
 $ ego   : chr  "BETSY" "BETSY" "BETSY" "BETSY" ...
 $ alter : chr  "MEPHISTO" "MARCUS" "ANGELA" "GUTTERBUHG" ...
 $ weight: int  1 1 1 1 1 1 2 2 1 1 ...

Or just the dataframe LostArk

    ego      alter weight
1 BETSY   MEPHISTO      1
2 BETSY     MARCUS      1
3 BETSY     ANGELA      1
4 BETSY GUTTERBUHG      1
5 BETSY    POSTMAN      1
6 BETSY      TEDDY      1

These data are in an edgelist format. In this format, the first column is the name of the ego node and the second column is the name of the alter. Any additional columns can contain edge attributes such as tie type, strength, value, etc. An edgelist can be converted into an igraph object using the graph.data.frame() function.

LostArkg<-graph.data.frame(LostArk)      

#read more about what this function does
?graph.data.frame

Check out a summary of our data: LostArkg. Notice the network attributes, such as node names and edge weight. Is this network directed or undirected? How many nodes and edges are in the network?

IGRAPH 78c6559 DNW- 32 134 -- 
+ attr: name (v/c), weight (e/n)
+ edges from 78c6559 (vertex names):
 [1] BETSY ->MEPHISTO      BETSY ->MARCUS        BETSY ->ANGELA       
 [4] BETSY ->GUTTERBUHG    BETSY ->POSTMAN       BETSY ->TEDDY        
 [7] BETSY ->DASHIELL      BETSY ->KEZURE        BETSY ->VIRGIL       
[10] BETSY ->DEAN COVENTRY BETSY ->INDIANA       BETSY ->SCRAGGY      
[13] BETSY ->CREW MEMBER   BETSY ->CLARE         BETSY ->PRIESTLY     
[16] BETSY ->JULIA         BETSY ->CHARLES       BETSY ->PORTER       
[19] BETSY ->REBECCA       BETSY ->SUN WU KUNG   MARCUS->INDIANA      
[22] ANGELA->INDIANA       ANGELA->PRIESTLY      ANGELA->POSTMAN      
+ ... omitted several edges
LostArkg
IGRAPH 78c6559 DNW- 32 134 -- 
+ attr: name (v/c), weight (e/n)
+ edges from 78c6559 (vertex names):
 [1] BETSY ->MEPHISTO      BETSY ->MARCUS        BETSY ->ANGELA       
 [4] BETSY ->GUTTERBUHG    BETSY ->POSTMAN       BETSY ->TEDDY        
 [7] BETSY ->DASHIELL      BETSY ->KEZURE        BETSY ->VIRGIL       
[10] BETSY ->DEAN COVENTRY BETSY ->INDIANA       BETSY ->SCRAGGY      
[13] BETSY ->CREW MEMBER   BETSY ->CLARE         BETSY ->PRIESTLY     
[16] BETSY ->JULIA         BETSY ->CHARLES       BETSY ->PORTER       
[19] BETSY ->REBECCA       BETSY ->SUN WU KUNG   MARCUS->INDIANA      
[22] ANGELA->INDIANA       ANGELA->PRIESTLY      ANGELA->POSTMAN      
+ ... omitted several edges

Plotting the Raiders of the Lost Ark network

Making a network plot of these data is simple: plot(LostArkg)

Looks good, but we can do better.

Try it

Adjust the edge arrow size, vertex and label size, colors, or other plotting parameters to make a more legible graph. Look at the plotting parameters for igraph on their website and package vignette to see which aspects of the plot cann be customized.

Click for solution

Network statistics

Network analysis is a powerful tool not only for visualization, but also for understanding power relationships and flows between actors. igraph includes functions for calculating network, dyadic, and node level statistics. First, let’s calculate some node centrality metrics.

V(LostArkg)$degree <- degree(LostArkg)
V(LostArkg)$betweenness <- betweenness(LostArkg)

Try it

Determine whether the default output from degree() is indegree, outdegree, or all ties. Create additional node attributes for the missing centrality measures. What do these metrics tell you about the roles and interactions between characters in this film?

Click for solution
V(LostArkg)$degree <- degree(LostArkg)
V(LostArkg)$degreeAll <- degree(LostArkg, mode=c("all"))
V(LostArkg)$degreeIn <- degree(LostArkg, mode=c("in"))
V(LostArkg)$degreeOut <- degree(LostArkg, mode=c("out"))

Try it

Determine who has the highest betweenness. Use both graphical and statistical methods to identify this character. Does this fit with your understanding of the plot of the movie?

Click for solution
degree(LostArkg) #look at the degree centrality for each node
        BETSY        MARCUS        ANGELA     YOUNG MAN         TEDDY 
           20             2            10             8            10 
       VIRGIL ELDERLY WOMAN       SCRAGGY      MACGOWAN     OLD WOMAN 
           10             8            10            10             8 
  CREW MEMBER   FIRST DRAFT       CHARLES       REBECCA         JULIA 
            3             8            10            10            10 
  SUN WU KUNG       INDIANA        DRIVER      DASHIELL    GUTTERBUHG 
            5            30             2             5             7 
      POSTMAN         WOMAN        KEZURE DEAN COVENTRY      MEPHISTO 
           10             8             5            10             4 
      OLD MAN         CLARE      PRIESTLY        PORTER           MAN 
            8            10            10             5             8 
     VILLAGER     GALBRAITH 
            2             2 
V(LostArkg)[degree(LostArkg)>mean(degree(LostArkg))] #select all the nodes for which the degree is greater than the mean degree
+ 14/32 vertices, named, from 78c6559:
 [1] BETSY         ANGELA        TEDDY         VIRGIL        SCRAGGY      
 [6] MACGOWAN      CHARLES       REBECCA       JULIA         INDIANA      
[11] POSTMAN       DEAN COVENTRY CLARE         PRIESTLY     
V(LostArkg)[degree(LostArkg) == max(degree(LostArkg))] #select the node with the maximum degree
+ 1/32 vertex, named, from 78c6559:
[1] INDIANA
V(LostArkg)[betweenness(LostArkg) == max(betweenness(LostArkg))]
+ 1/32 vertex, named, from 78c6559:
[1] INDIANA
V(LostArkg)[betweenness(LostArkg) == min(betweenness(LostArkg))]
+ 14/32 vertices, named, from 78c6559:
 [1] BETSY         YOUNG MAN     ELDERLY WOMAN SCRAGGY       POSTMAN      
 [6] WOMAN         KEZURE        DEAN COVENTRY OLD MAN       PRIESTLY     
[11] PORTER        MAN           VILLAGER      GALBRAITH    
betweenness(LostArkg)
        BETSY        MARCUS        ANGELA     YOUNG MAN         TEDDY 
    0.0000000     0.8750000     0.8750000     0.0000000     0.8750000 
       VIRGIL ELDERLY WOMAN       SCRAGGY      MACGOWAN     OLD WOMAN 
    0.8750000     0.0000000     0.0000000     1.0000000     5.5000000 
  CREW MEMBER   FIRST DRAFT       CHARLES       REBECCA         JULIA 
    0.5833333     5.5000000     0.8750000     0.8750000     0.8750000 
  SUN WU KUNG       INDIANA        DRIVER      DASHIELL    GUTTERBUHG 
   12.9583333   171.5000000    18.3333333     4.3333333     7.0833333 
      POSTMAN         WOMAN        KEZURE DEAN COVENTRY      MEPHISTO 
    0.0000000     0.0000000     0.0000000     0.0000000     4.9166667 
      OLD MAN         CLARE      PRIESTLY        PORTER           MAN 
    0.0000000     4.0000000     0.0000000     0.0000000     0.0000000 
     VILLAGER     GALBRAITH 
    0.0000000     0.0000000 

plot(LostArkg,vertex.label.family="Helvetica", vertex.label.cex=0.4, vertex.color="goldenrod",vertex.label.family="Helvetica", vertex.label.color="royal blue", edge.arrow.size=0.1, vertex.frame.color="bisque", edge.color=adjustcolor("light grey",0.2), vertex.size=(V(LostArkg)$betweenness/5+3))

Using nodal attributes, we can compare across the different nodes. Here we plot two different networks sized by indegree and outdegree. How might we interpret these results?


par(mfrow=c(1,2))
plot(LostArkg,vertex.label.family="Helvetica", vertex.label.cex=0.3, vertex.color="goldenrod",vertex.label.family="Helvetica", vertex.label.color="royal blue", edge.arrow.size=0.1, vertex.frame.color="bisque", edge.color=adjustcolor("light grey",.8), vertex.size=(V(LostArkg)$degreeOut*2), edge.width=E(LostArkg)$weight, main="Out degree")

plot(LostArkg,vertex.label.family="Helvetica", vertex.label.cex=0.3, vertex.color="goldenrod",vertex.label.family="Helvetica", vertex.label.color="royal blue", edge.arrow.size=0.1, vertex.frame.color="bisque", edge.color=adjustcolor("light grey",0.8), edge.width=E(LostArkg)$weight, vertex.size=(V(LostArkg)$degreeIn*2), main="In degree")

par(mfrow=c(1,1))

igraph also includes functions for calculating closeness and eigenvector centrality. You can read more about these functions here

Community detection algorithms

There are numerous community detection algorithms for network analysis in igraph. These algorithms allow analyists to divide a network into smaller communities for analysis of the modularity and connectedness of various subgroups. Different algorithms will be appropriate depending on which analysis you intend to conduct. Some available igraph functions include: infomap.community(), edge.betweenness.community(), fastgreedy.community() and others. Here we will use the spinglass.community() function to identify subgroups in our network.

#set seed and run community detection algorithm
set.seed(1000)
sg1 <- spinglass.community(LostArkg)

##plot communities with shaded background for each community
plot(sg1, LostArkg,
   col=membership(sg1),
   mark.groups=communities(sg1),
   vertex.size=15,
     vertex.label=NA,
   edge.arrow.size=0.5,
   main="Scene interaction communities")


#plot network colored by community
plot(LostArkg,
   vertex.color=membership(sg1),
   vertex.size=15,
  vertex.label.cex=0.3, vertex.label.family="Helvetica", vertex.frame.color="white", vertex.label.color="black",
   edge.arrow.size=0.2,
  edge.width=E(LostArkg)$weight,
   main="Scene interaction communities")

Try it

We can also compare the community detection algorithms across several different methods. The best algorithm to use will depend on the network type and your interpretation of the results. Try running and plotting two other community detection algorithms (check out the package documentation if you are stuck). Which algorithm do you think best explains the interactions between characters in this film? If you were to name or describe each detected community, what would you call them?

Click for solution

#set.seed(1000)
#fg <- fastgreedy.community(LostArkg) #only works on undirected graphs

set.seed(1000)
wt <- walktrap.community(LostArkg)

set.seed(1000)

eb<- edge.betweenness.community(LostArkg)


par(mfrow=c(1,3))
plot(LostArkg,
   vertex.color=sg1$membership,
   vertex.size=15,
  vertex.label.cex=0.3, vertex.label.family="Helvetica", vertex.frame.color="white", vertex.label.color="black",
   edge.arrow.size=0.2,
  edge.width=E(LostArkg)$weight,
   main="Spinglass")

plot(LostArkg,
   vertex.color=eb$membership,
   vertex.size=15,
  vertex.label.cex=0.3, vertex.label.family="Helvetica", vertex.frame.color="white", vertex.label.color="black",
   edge.arrow.size=0.2,
  edge.width=E(LostArkg)$weight,
   main="Edge Betweenness")

plot(LostArkg,
   vertex.color=wt$membership,
   vertex.size=15,
  vertex.label.cex=0.3, vertex.label.family="Helvetica", vertex.frame.color="white", vertex.label.color="black",
   edge.arrow.size=0.2,
  edge.width=E(LostArkg)$weight,
   main="Walktrap")


par(mfrow=c(1,1))

Edge reciprocity and path length

Let’s go back to the Chesapeake Bay foodweb data to analyze the reciprocity in the network. Reciprocity is often of interest to anthropologists, whether we study food sharing, social support, or other interactions. Advanced network analysis can also include assessing transitivity and triangles.

reciprocity(ChesUpper) # what proportion of ties are reciprocated
[1] 0.1401869
dyad_census(ChesUpper) # the number of dyads that are mutual, asymmetrical or null
$mut
[1] 15

$asym
[1] 184

$null
[1] 467

We can also calculate the average and shortest path lengths. Let’s find the shortest path between Blue Crabs and Free Bacteria.

mean_distance(ChesUpper, directed=T)
[1] 13997.97
V(ChesUpper)$name
 [1] "Net Phytoplankton"          "Picoplankton"              
 [3] "Free Bacteria"              "Particle Attached Bacteria"
 [5] "Heteroflagellates"          "Ciliates"                  
 [7] "Rotifers"                   "Meroplankton"              
 [9] "Mesozooplankton"            "Ctenophores"               
[11] "Chrysaora"                  "Microphytobenthos"         
[13] "SAV"                        "Benthic Bacteria"          
[15] "Meiofauna"                  "Deposit Feeding Benthos"   
[17] "Suspension Feeding Benthos" "Oysters"                   
[19] "Blue Crab"                  "Menhaden"                  
[21] "Bay anchovy"                "Herrings and Shads"        
[23] "White Perch"                "Spot"                      
[25] "Croaker"                    "Hogchoker"                 
[27] "American Eel"               "Catfish"                   
[29] "Striped Bass"               "Bluefish"                  
[31] "Weakfish"                   "DOC"                       
[33] "Sediment POC"               "POC"                       
[35] "Input"                      "Output"                    
[37] "Respiration"               

BlueCrabtoFreeBacteria <- shortest_paths(ChesUpper,
from = V(ChesUpper)[name=="Blue Crab"],
to = V(ChesUpper)[name=="Free Bacteria"],
output = "both")

# highlight shortest path. Adapted from: https://kateto.net/wp-content/uploads/2016/01/NetSciX_2016_Workshop.pdf

edgecolors <- rep("light gray", ecount(ChesUpper))
edgecolors[unlist(BlueCrabtoFreeBacteria$epath)] <- "turquoise"

edgewidth <- rep(2, ecount(ChesUpper))
edgewidth[unlist(BlueCrabtoFreeBacteria$epath)] <- 4

vertexcolors <- rep("light gray", vcount(ChesUpper))
vertexcolors[unlist(BlueCrabtoFreeBacteria$vpath)] <- "turquoise"

vertexnames<- rep(NA, vcount(ChesUpper))
vertexnames[unlist(BlueCrabtoFreeBacteria$vpath)] <- names(unlist(BlueCrabtoFreeBacteria$vpath))

plot(ChesUpper, vertex.color=vertexcolors, edge.color=edgecolors,
edge.width=edgewidth, edge.arrow.mode=0, layout=layout.kamada.kawai, vertex.label=vertexnames, vertex.size=5, vertex.label.cex=0.8, vertex.label.dist=2)

Next steps

There is a ton more that you can do with networks in R. igraph, as covered here, is particularly helpful for network visualizations. If you are interested in going further into network statistics check out the statnet packages.