- 1. Getting Started
- 2. Statistical network modeling; the
*summary*and*ergm*commands, and supporting functions - 3. Model terms available for
*ergm*estimation and simulation - 4. Network simulation: the
*simulate*command and*network.list*objects - 5. Examining the quality of model fit – GOF
- 6. Diagnostics: troubleshooting and checking for model degeneracy
- 7. Working with egocentrically sampled network data
- 8. Additional functionality in the statnet family of packages
- Appendix A: Clarifying the terms – ergm and network
- References

*Last updated 07-01-2014*

*This tutorial is a joint product of the Statnet Development Team:*

Martina Morris (University of Washington)

Mark S. Handcock (University of California, Los Angeles)

Carter T. Butts (University of California, Irvine)

David R. Hunter (Penn State University)

Steven M. Goodreau (University of Washington)

Skye Bender de-Moll (Oakland)

Pavel N. Krivitsky (University of Wollongong)

For general questions and comments, please refer to statnet users group and mailing list

http://statnet.csde.washington.edu/statnet_users_group.shtml

Open an R session, and set your working directory to the location where you would like to save this work.

To install all of the packages in the statnet suite:

```
install.packages('statnet')
library(statnet)
```

Or, to only install the specific statnet packages needed for this tutorial:

```
install.packages('ergm') # will install the network package
install.packages('sna')
```

After the first time, to update the packages one can either repeat the commands above, or use:

`update.packages('name.of.package')`

For this tutorial, we will need one additional package (coda), which is recommended (but not required) by ergm:

`install.packages('coda')`

Make sure the packages are attached:

`library(statnet)`

or

```
library(ergm)
library(sna)
library(coda)
```

Check package version

```
# latest versions: ergm 3.1.2 and network 1.9.0 (as of 6/17/2014)
sessionInfo()
```

Set seed for simulations – this is not necessary, but it ensures that we all get the same results (if we execute the same commands in the same order).

`set.seed(0)`

Exponential-family random graph models (ERGMs) represent a general class of models based in exponential-family theory for specifying the probability distribution for a set of random graphs or networks. Within this framework, one can—among other tasks—obtain maximum-likehood estimates for the parameters of a specified model for a given data set; test individual models for goodness-of-fit, perform various types of model comparison; and simulate additional networks with the underlying probability distribution implied by that model.

The general form for an ERGM can be written as:

\[ P(Y=y)=\frac{\exp(\theta'g(y))}{k(\theta)} \]

where Y is the random variable for the state of the network (with realization y), \(g(y)\) is a vector of model statistics for network y, \(\theta\) is the vector of coefficients for those statistics, and \(k(\theta)\) represents the quantity in the numerator summed over all possible networks (typically constrained to be all networks with the same node set as y).

This can be re-expressed in terms of the conditional log-odds of a single tie between two actors:

\[ \operatorname{logit}{(Y_{ij}=1|y^{c}_{ij})=\theta'\delta(y_{ij})} \]

where \(Y_{ij}\) is the random variable for the state of the actor pair \(i,j\) (with realization \(y_{ij}\)), and \(y^{c}_{ij}\) signifies the complement of \(y_{ij}\), i.e. all dyads in the network other than \(y_{ij}\). The vector \(\delta(y_{ij})\) contains the “change statistic” for each model term. The change statistic records how \(g(y)\) term changes if the \(y_{ij}\) tie is toggled on or off. So:

\[ \delta(y_{ij}) = g(y^{+}_{ij})-g(y^{-}_{ij}) \]

where \(y^{+}_{ij}\) is defined as \(y^{c}_{ij}\) along with \(y_{ij}\) set to 1, and \(y^{-}_{ij}\) is defined as \(y^{c}_{ij}\) along with \(y_{ij}\) set to 0. That is, \(\delta(y_{ij})\) equals the value of \(g(y)\) when \(y_{ij}=1\) minus the value of \(g(y)\) when \(y_{ij}=0\), but all other dyads are as in \(g(y)\).

This emphasizes that the coefficient \(\theta\) can be interpreted as the log-odds of an individual tie conditional on all others.

The model terms \(g(y)\) are functions of network statistics that we hypothesize may be more or less common than what would be expected in a simple random graph (where all ties have the same probability). For example, specific degree distributions, or triad configurations, or homophily on nodal attributes. We will explore some of these terms in this tutorial, and links to more information are provided in section 3.

One key distinction in model terms is worth keeping in mind: terms are either *dyad independent* or *dyad dependent*. Dyad independent terms (like nodal homophily terms) imply no dependence between dyads—the presence or absence of a tie may depend on nodal attributes, but not on the state of other ties. Dyad dependent terms (like degree terms, or triad terms), by contrast, imply dependence between dyads. Such terms have very different effects, and much of what is different about network models comes from the complex cascading effects that these terms introduce. A model with dyad dependent terms also requires a different estimation algorithm, and you will see some different components in the output.

We’ll start by running some simple models to demonstrate the use of the “summary” and “ergm” commands. The ergm package contains several network data sets that we will use for demonstration purposes here.

`data(package='ergm') # tells us the datasets in our packages`

We begin with the simplest possible model, the Bernoulli or Erdos-Renyi model, which contains only one term to capture the density of the network as a function of a homogenous edge probability. The ergm-term for this is ** edge**. We’ll fit this simple model to Padgett’s Florentine marriage network. As with all data analysis, we start by looking at our data: using graphical and numerical descriptives.

```
data(florentine) # loads flomarriage and flobusiness data
flomarriage # Let's look at the flomarriage network properties
```

```
Network attributes:
vertices = 16
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 20
missing edges= 0
non-missing edges= 20
Vertex attribute names:
priorates totalties vertex.names wealth
No edge attributes
```

```
par(mfrow=c(1,2)) # Setup a 2 panel plot (for later)
plot(flomarriage, main="Florentine Marriage", cex.main=0.8) # Plot the flomarriage network
summary(flomarriage~edges) # Look at the $g(y)$ statistic for this model
```

```
edges
20
```

```
flomodel.01 <- ergm(flomarriage~edges) # Estimate the model
summary(flomodel.01) # The fitted model object
```

```
==========================
Summary of model fit
==========================
Formula: flomarriage ~ edges
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -1.609 0.245 NA <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 166 on 120 degrees of freedom
Residual Deviance: 108 on 119 degrees of freedom
AIC: 110 BIC: 113 (Smaller is better.)
```

How should we interpret the coefficient from this model? The log-odds of any tie existing is:

\[ \small{ \begin{eqnarray*} & = & -1.609\times\mbox{change in the number of ties}\\ & = & -1.609\times1 \end{eqnarray*} } \]

for all ties, since the addition of any tie to the network always changes the number of ties by 1 for a tie toggled from 0 to 1 (or by -1 for a tie toggled from 1 to 0).

The corresponding probability is:

\[ \small{ \begin{eqnarray*} & = & \exp(-1.609)/(1+\exp(-1.609))\\ & = & 0.1667 \end{eqnarray*} } \]

which corresponds to the density we observe in the flomarriage network: there are 20 ties and (16 choose 2 = 16*15/2 =) 120 dyads.

Let’s add a term often thought to be a measure of “clustering”: the number of completed triangles. The ergm-term for this is ** triangle**. This is a dyad dependent term. As a result, the estimation algorithm automatically changes to MCMC, and because this is a form of stochastic estimation your results may differ slightly.

`summary(flomarriage~edges+triangle) # Look at the g(y) stats for this model`

```
edges triangle
20 3
```

`flomodel.02 <- ergm(flomarriage~edges+triangle) `

```
Iteration 1 of at most 20:
Convergence test P-value: 1.6e-29
The log-likelihood improved by 0.001997
Iteration 2 of at most 20:
Convergence test P-value: 7e-09
The log-likelihood improved by 0.0005728
Iteration 3 of at most 20:
Convergence test P-value: 1.4e-02
The log-likelihood improved by 0.0001287
Iteration 4 of at most 20:
Convergence test P-value: 2.2e-01
The log-likelihood improved by < 0.0001
Iteration 5 of at most 20:
Convergence test P-value: 8.9e-01
Convergence detected. Stopping.
The log-likelihood improved by < 0.0001
This model was fit using MCMC. To examine model diagnostics and check for degeneracy, use the mcmc.diagnostics() function.
```

`summary(flomodel.02)`

```
==========================
Summary of model fit
==========================
Formula: flomarriage ~ edges + triangle
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -1.675 0.346 0 <1e-04 ***
triangle 0.159 0.578 0 0.78
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 166 on 120 degrees of freedom
Residual Deviance: 108 on 118 degrees of freedom
AIC: 112 BIC: 118 (Smaller is better.)
```

Now, how should we interpret coefficients?

The conditional log-odds of two actors having a tie is:

\[ \small{ -1.68\times\mbox{change in the number of ties}+0.16\times\mbox{change in number of triangles} } \]

- For a tie that will create no triangles, the conditional log-odds is: \(-1.68\).
- if one triangle: \(-1.68 + 0.16 =-1.52\)
- if two triangles: \(-1.68 +0.16\times2=-1.36\)
- the corresponding probabilities are 0.16, 0.18, and 0.20.

Let’s take a closer look at the ergm object itself:

`class(flomodel.02) # this has the class ergm`

`[1] "ergm"`

`names(flomodel.02) # the ERGM object contains lots of components.`

```
[1] "coef" "sample" "sample.obs" "iterations"
[5] "MCMCtheta" "loglikelihood" "gradient" "hessian"
[9] "covar" "failure" "mc.se" "network"
[13] "newnetwork" "coef.init" "initialfit" "coef.hist"
[17] "stats.hist" "etamap" "formula" "target.stats"
[21] "constrained" "constraints" "control" "reference"
[25] "estimate" "offset" "drop" "estimable"
[29] "null.lik" "mle.lik"
```

`flomodel.02$coef # you can extract/inspect individual components`

```
edges triangle
-1.6749 0.1591
```

We can test whether edge probabilities are a function of wealth. This is a nodal covariate, so we use the ergm-term **nodecov**.

```
wealth <- flomarriage %v% 'wealth' # %v% references vertex attributes
wealth
```

` [1] 10 36 55 44 20 32 8 42 103 48 49 3 27 10 146 48`

`summary(wealth) # summarize the distribution of wealth`

```
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.0 17.5 39.0 42.6 48.2 146.0
```

`plot(flomarriage, vertex.cex=wealth/25, main="Florentine marriage by wealth", cex.main=0.8) # network plot with vertex size proportional to wealth`

`summary(flomarriage~edges+nodecov('wealth')) # observed statistics for the model`

```
edges nodecov.wealth
20 2168
```

```
flomodel.03 <- ergm(flomarriage~edges+nodecov('wealth'))
summary(flomodel.03)
```

```
==========================
Summary of model fit
==========================
Formula: flomarriage ~ edges + nodecov("wealth")
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -2.59493 0.53606 NA <1e-04 ***
nodecov.wealth 0.01055 0.00467 NA 0.026 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 166 on 120 degrees of freedom
Residual Deviance: 103 on 118 degrees of freedom
AIC: 107 BIC: 113 (Smaller is better.)
```

Yes, there is a significant positive wealth effect on the probability of a tie.

How do we interpret the coefficients here? Note that the wealth effect operates on both nodes in a dyad. The conditional log-odds of a tie between two actors is:

\[ \small{ -2.59\times\mbox{change in the number of ties} + 0.01\times\mbox{the wealth of node 1} + 0.01\times\mbox{the wealth of node 2} } \]

\[ \small{ -2.59\times\mbox{change in the number of ties} + 0.01\times\mbox{the sum of the wealth of the two nodes} } \]

- for a tie between two nodes with minimum wealth, the conditional log-odds is:

\(-2.59 + 0.01*(3+3) = -2.53\) - for a tie between two nodes with maximum wealth:

\(-2.59 + 0.01*(146+146) = 0.33\) - for a tie between the node with maximum wealth and the node with minimum wealth:

\(-2.59 + 0.01*(146+3) = -1.1\) - The corresponding probabilities are 0.07, 0.58, and 0.25.

Note: This model specification does not include a term for homophily by wealth. It just specifies a relation between wealth and mean degree. To specify homophily on wealth, you would use the ergm-term ** absdiff** see section 3 below for more information on ergm-terms

Let’s try a larger network, a simulated mutual friendship network based on one of the schools from the Add Health study. Here, we’ll examine the homophily in friendships by grade and race. Both are discrete attributes so we use the ergm-term ** nodematch**.

```
data(faux.mesa.high)
mesa <- faux.mesa.high
```

`mesa`

```
Network attributes:
vertices = 205
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 203
missing edges= 0
non-missing edges= 203
Vertex attribute names:
Grade Race Sex
No edge attributes
```

```
par(mfrow=c(1,1)) # Back to 1-panel plots
plot(mesa, vertex.col='Grade')
legend('bottomleft',fill=7:12,legend=paste('Grade',7:12),cex=0.75)
```

`fauxmodel.01 <- ergm(mesa ~edges + nodematch('Grade',diff=T) + nodematch('Race',diff=T))`

`Observed statistic(s) nodematch.Race.Black and nodematch.Race.Other are at their smallest attainable values. Their coefficients will be fixed at -Inf.`

`summary(fauxmodel.01)`

```
==========================
Summary of model fit
==========================
Formula: mesa ~ edges + nodematch("Grade", diff = T) + nodematch("Race",
diff = T)
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -6.2343 0.1741 NA <1e-04 ***
nodematch.Grade.7 2.8749 0.1981 NA <1e-04 ***
nodematch.Grade.8 2.8785 0.2391 NA <1e-04 ***
nodematch.Grade.9 2.4621 0.2647 NA <1e-04 ***
nodematch.Grade.10 2.5703 0.3770 NA <1e-04 ***
nodematch.Grade.11 3.2930 0.2977 NA <1e-04 ***
nodematch.Grade.12 3.8384 0.4592 NA <1e-04 ***
nodematch.Race.Black -Inf NA NA NA
nodematch.Race.Hisp 0.0692 0.1737 NA 0.690
nodematch.Race.NatAm 0.9832 0.1842 NA <1e-04 ***
nodematch.Race.Other -Inf NA NA NA
nodematch.Race.White 1.2694 0.5370 NA 0.018 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 28987 on 20910 degrees of freedom
Residual Deviance: 1899 on 20898 degrees of freedom
AIC: 1923 BIC: 2018 (Smaller is better.)
Warning: The following terms have infinite coefficient estimates:
nodematch.Race.Black nodematch.Race.Other
```

Note that two of the coefficients are estimated as -Inf (the nodematch coefficients for race Black and Other). Why is this?

`table(mesa %v% 'Race') # Frequencies of race`

```
Black Hisp NatAm Other White
6 109 68 4 18
```

`mixingmatrix(mesa, "Race")`

```
Note: Marginal totals can be misleading
for undirected mixing matrices.
Black Hisp NatAm Other White
Black 0 8 13 0 5
Hisp 8 53 41 1 22
NatAm 13 41 46 0 10
Other 0 1 0 0 0
White 5 22 10 0 4
```

The problem is that there are very few students in the Black and Other race categories, and these few students form no within-group ties. The empty cells are what produce the -Inf estimates.

Note that we would have caught this earlier if we had looked at the \(g(y)\) stats at the beginning:

`summary(mesa ~edges + nodematch('Grade',diff=T) + nodematch('Race',diff=T))`

```
edges nodematch.Grade.7 nodematch.Grade.8
203 75 33
nodematch.Grade.9 nodematch.Grade.10 nodematch.Grade.11
23 9 17
nodematch.Grade.12 nodematch.Race.Black nodematch.Race.Hisp
6 0 53
nodematch.Race.NatAm nodematch.Race.Other nodematch.Race.White
46 0 4
```

**Moral**: It’s a good idea to check the descriptive statistics of a model in the observed network before fitting the model.

See also the ergm-term ** nodemix** for fitting mixing patterns other than homophily on discrete nodal attributes.

Let’s try a model for a directed network, and examine the tendency for ties to be reciprocated (“mutuality”). The ergm-term for this is ** mutual**. We’ll fit this model to the third wave of the classic Sampson Monastery data, and we’ll start by taking a look at the network.

```
data(samplk)
ls() # directed data: Sampson's Monks
```

```
[1] "faux.mesa.high" "fauxmodel.01" "flobusiness" "flomarriage"
[5] "flomodel.01" "flomodel.02" "flomodel.03" "mesa"
[9] "metadata" "samplk1" "samplk2" "samplk3"
[13] "wealth"
```

`samplk3`

```
Network attributes:
vertices = 18
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 56
missing edges= 0
non-missing edges= 56
Vertex attribute names:
cloisterville group vertex.names
No edge attributes
```

`plot(samplk3)`

`summary(samplk3~edges+mutual)`

```
edges mutual
56 15
```

The plot now shows the direction of a tie, and the \(g(y)\) statistics for this model in this network are 56 total ties, and 15 mutual dyads (so 30 of the 56 ties are mutual ties).

`sampmodel.01 <- ergm(samplk3~edges+mutual)`

```
Iteration 1 of at most 20:
Convergence test P-value: 8.9e-01
Convergence detected. Stopping.
The log-likelihood improved by < 0.0001
This model was fit using MCMC. To examine model diagnostics and check for degeneracy, use the mcmc.diagnostics() function.
```

`summary(sampmodel.01)`

```
==========================
Summary of model fit
==========================
Formula: samplk3 ~ edges + mutual
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -2.155 0.218 0 <1e-04 ***
mutual 2.299 0.480 0 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 424 on 306 degrees of freedom
Residual Deviance: 268 on 304 degrees of freedom
AIC: 272 BIC: 279 (Smaller is better.)
```

There is a strong and significant mutuality effect. The coefficients for the edges and mutual terms roughly cancel for a mutual tie, so the conditional odds of a mutual tie are about even, and the probability is about 50%. By contrast a non-mutual tie has a conditional log-odds of -2.16, or 10% probability.

Triangle terms in directed networks can have many different configurations, given the directional ties. Many of these configurations are coded up as ergm-terms (and we’ll talk about these more below).

It is important to distinguish between the absence of a tie, and the absence of data on whether a tie exists. You should not code both of these as “0”. The \(ergm\) package recognizes and handles missing data appropriately, as long as you identify the data as missing. Let’s explore this with a simple example.

Let’s start with estimating an ergm on a network with two missing ties, where both ties are identified as missing.

```
missnet <- network.initialize(10,directed=F)
missnet[1,2] <- missnet[2,7] <- missnet[3,6] <- 1
missnet[4,6] <- missnet[4,9] <- missnet[5,6] <- NA
summary(missnet)
```

```
Network attributes:
vertices = 10
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges = 6
missing edges = 3
non-missing edges = 3
density = 0.06667
Vertex attributes:
vertex.names:
character valued attribute
10 valid vertex names
No edge attributes
Network adjacency matrix:
1 2 3 4 5 6 7 8 9 10
1 0 1 0 0 0 0 0 0 0 0
2 1 0 0 0 0 0 1 0 0 0
3 0 0 0 0 0 1 0 0 0 0
4 0 0 0 0 0 NA 0 0 NA 0
5 0 0 0 0 0 NA 0 0 0 0
6 0 0 1 NA NA 0 0 0 0 0
7 0 1 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0 0
9 0 0 0 NA 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0
```

```
# plot missnet with missing edge colored red.
tempnet <- missnet
tempnet[4,6] <- tempnet[4,9] <- tempnet[5,6] <- 1
missnetmat <- as.matrix(missnet)
missnetmat[is.na(missnetmat)] <- 2
plot(tempnet,label = network.vertex.names(tempnet),edge.col = missnetmat)
```

`summary(missnet~edges)`

```
edges
3
```

`summary(ergm(missnet~edges))`

```
==========================
Summary of model fit
==========================
Formula: missnet ~ edges
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -2.565 0.599 NA 0.00011 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 58.2 on 42 degrees of freedom
Residual Deviance: 21.6 on 41 degrees of freedom
AIC: 23.6 BIC: 25.4 (Smaller is better.)
```

The coefficient equals -2.56, which corresponds to a probability of 7.14%. Our network has 3 ties, out of the 42 non-missing nodal pairs (10 choose 2 minus 3): 3/42 = 7.14%. So our estimate represents the probability of a tie in the observed sample.

Now let’s assign those missing ties the value “0” and see what happens.

```
missnet_bad <- missnet
missnet_bad[4,6] <- missnet_bad[4,9] <- missnet_bad[5,6] <- 0
summary(missnet_bad)
```

```
Network attributes:
vertices = 10
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges = 3
missing edges = 0
non-missing edges = 3
density = 0.06667
Vertex attributes:
vertex.names:
character valued attribute
10 valid vertex names
No edge attributes
Network adjacency matrix:
1 2 3 4 5 6 7 8 9 10
1 0 1 0 0 0 0 0 0 0 0
2 1 0 0 0 0 0 1 0 0 0
3 0 0 0 0 0 1 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0
6 0 0 1 0 0 0 0 0 0 0
7 0 1 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0 0
9 0 0 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0
```

`summary(ergm(missnet_bad~edges))`

```
==========================
Summary of model fit
==========================
Formula: missnet_bad ~ edges
Iterations: 20
Monte Carlo MLE Results:
Estimate Std. Error MCMC % p-value
edges -2.639 0.598 NA <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 62.4 on 45 degrees of freedom
Residual Deviance: 22.0 on 44 degrees of freedom
AIC: 24 BIC: 25.9 (Smaller is better.)
```

The coefficient is smaller now because the missing ties are counted as “0”, and translates to a conditional tie probability of 6.67%. It’s a small difference in this case (and a small network, with little missing data).

MORAL: If you have missing data on ties, be sure to identify them by assigning the “NA” code. This is particularly important if you’re reading in data as an edgelist, as all dyads without edges are implicitly set to “0” in this case.

Model terms are the expressions (e.g. “triangle”) used to represent predictors on the right-hand size of equations used in:

- calls to
`summary`

(to obtain measurements of network statistics on a dataset) - calls to
`ergm`

(to estimate an ergm model) - calls to
`simulate`

(to simulate networks from an ergm model fit)

Many ERGM terms are simple counts of configurations (e.g., edges, nodal degrees, stars, triangles), but others are more complex functions of these configurations (e.g., geometrically weighted degrees and shared partners). In theory, any configuration (or function of configurations) can be a term in an ERGM. In practice, however, these terms have to be constructed before they can be used—that is, one has to explicitly write an algorithm that defines and calculates the network statistic of interest. This is another key way that ERGMs differ from traditional linear and general linear models.

The terms that can be used in a model also depend on the type of network being analyzed: directed or undirected, one-mode or two-mode (“bipartite”), binary or valued edges.

For a list of available terms that can be used to specify an ERGM, type:

`help('ergm-terms')`

A table of commonly used terms can be found here

A more complete discussion of many of these terms can be found in the ‘Specifications’ paper in the *Journal of Statistical Software v24(4)*

Finally, note that models with only dyad independent terms are estimated in statnet using a logistic regression algorithm to maximize the likelihood. Dyad dependent terms require a different approach to estimation, which, in statnet, is based on a Monte Carlo Markov Chain (MCMC) algorithm that stochastically approximates the Maximum Likelihood.

We have recently released a new package (`ergm.userterms`

) that makes it much easier to write one’s own ergm-terms. The package is available on CRAN, and installing it will include the tutorial (ergmuserterms.pdf). Alternatively, the tutorial can be found in the *Journal of Statistical Software 52(2)*, and some introductory slides from the workshop we teach on coding ergm-terms can be found here.

Note that writing up new `ergm`

terms requires some knowledge of C and the ability to build R from source (although the latter is covered in the tutorial, the many environments for building R and the rapid changes in these environments make these instructions obsolete quickly).

Once we have estimated the coefficients of an ERGM, the model is completely specified. It defines a probability distribution across all networks of this size. If the model is a good fit to the observed data, then networks drawn from this distribution will be more likely to “resemble” the observed data. To see examples of networks drawn from this distribution we use the `simulate`

command:

```
flomodel.03.sim <- simulate(flomodel.03,nsim=10)
class(flomodel.03.sim)
```

`[1] "network.list"`

`summary(flomodel.03.sim)`

```
Number of Networks: 10
Model: flomarriage ~ edges + nodecov("wealth")
Reference: ~Bernoulli
Constraints: ~.
Parameters:
edges nodecov.wealth
-2.59493 0.01055
Stored network statistics:
edges nodecov.wealth
[1,] 23 2327
[2,] 19 1975
[3,] 15 1731
[4,] 14 1766
[5,] 26 2534
[6,] 23 2618
[7,] 22 2341
[8,] 21 2389
[9,] 24 2661
[10,] 26 2647
```

`length(flomodel.03.sim)`

`[1] 10`

`flomodel.03.sim[[1]]`

```
Network attributes:
vertices = 16
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 23
missing edges= 0
non-missing edges= 23
Vertex attribute names:
priorates totalties vertex.names wealth
No edge attributes
```

`plot(flomodel.03.sim[[1]], label= flomodel.03.sim[[1]] %v% "vertex.names")`

Voila. Of course, yours will look somewhat different.

Simulation can be used for many purposes: to examine the range of variation that could be expected from this model, both in the sufficient statistics that define the model, and in other statistics not explicitly specified by the model. Simulation will play a large role in analyizing egocentrically sampled data in section 7 below. And if you take the `tergm`

workshop, you will see how we can use simulation to examine the temporal implications of a model based on a single cross-sectional egocentrically sampled dataset.

For now, we will examine one of the primary uses of simulation in the ergm package: using simulated data from the model to evaluate goodness of fit to the observed data.

ERGMs can be seen as generative models when they represent the process that governs the global patterns of tie prevalence from a local perspective: the perspective of the nodes involved in the particular micro-configurations represented by the ergm-terms in the model. The locally generated processes in turn aggregate up to produce characteristic global network properties, even though these global properties are not explicit terms in the model.

One test of whether a local model “fits the data” is therefore how well it reproduces the observed global network properties *that are not in the model*. We do this by choosing a network statistic that is not in the model, and comparing the value of this statistic observed in the original network to the distribution of values we get in simulated networks from our model, using the **gof** function.

The **gof** function is a bit different than the **summary**, **ergm**, and **simulate** functions, in that it currently only takes 3 ergm-terms as arguments: degree, esp (edgwise share partners), and distance (geodesic distances). Each of these terms captures an aggregate network distribution, at either the node level (degree), the edge level (esp), or the dyad level (distance).

```
flomodel.03.gof <- gof(flomodel.03~degree + esp + distance)
flomodel.03.gof
```

```
Goodness-of-fit for degree
obs min mean max MC p-value
0 1 0 1.55 6 1.00
1 4 0 3.28 7 0.90
2 2 0 4.19 10 0.40
3 6 0 3.19 7 0.08
4 2 0 2.08 6 1.00
5 0 0 0.94 3 0.88
6 1 0 0.43 2 0.68
7 0 0 0.24 2 1.00
8 0 0 0.09 1 1.00
9 0 0 0.01 1 1.00
Goodness-of-fit for edgewise shared partner
obs min mean max MC p-value
esp0 12 5 11.64 19 1.00
esp1 7 0 6.44 17 0.96
esp2 1 0 1.50 10 1.00
esp3 0 0 0.07 1 1.00
esp4 0 0 0.01 1 1.00
Goodness-of-fit for minimum geodesic distance
obs min mean max MC p-value
1 20 7 19.66 31 1.00
2 35 5 33.17 64 0.90
3 32 2 25.10 43 0.50
4 15 0 10.13 24 0.50
5 3 0 3.03 17 0.82
6 0 0 0.88 11 1.00
7 0 0 0.21 6 1.00
8 0 0 0.05 2 1.00
Inf 15 0 27.77 103 0.96
```

`plot(flomodel.03.gof)`

```
mesamodel.02 <- ergm(mesa~edges)
mesamodel.02.gof <- gof(mesamodel.02~degree + esp + distance, nsim=10)
plot(mesamodel.02.gof)
```

For a good example of model exploration and fitting for the Add Health Friendship networks, see Goodreau, Kitts & Morris, *Demography* 2009.

For more technical details on the approach, see Hunter, Goodreau and Handcock *JASA* 2008

The computational algorithms in `ergm`

use MCMC to estimate the likelihood function when dyad dependent terms are in the model. Part of this process involves simulating a set of networks to use as a sample for approximating the unknown component of the likelihood: the \(k(\theta)\) term in the denominator.

When a model is not a good representation of the observed network, these simulated networks may be far enough away from the observed network that the estimation process is affected. In the worst case scenario, the simulated networks will be so different that the algorithm fails altogether.

For more detailed discussion of model degeneracy in the ERGM context, see the papers by Mark Handcock referenced below.

In the worst case scenario, we end up not being able to obtain coefficent estimates, so we can’t use the GOF function to identify how the model simulations deviate from the observed data. In this case, however, we can use the MCMC diagnostics to observe what is happening with the simulation algorithm, and this (plus some experience and intuition about the behavior of ergm-terms) can help us improve the model specification.

Below we show a simple example of a model that converges, and one that doesn’t, and how to use the MCMC diagnostics to improve a model that isn’t converging.

We will first consider a simulation where the algorithm works using the program defaults, and observe the behavior of the MCMC estimation algorithm using the `mcmc.diagnostics`

function.

`summary(flobusiness~edges+degree(1))`

```
edges degree1
15 3
```

`fit <- ergm(flobusiness~edges+degree(1))`

```
Iteration 1 of at most 20:
Convergence test P-value: 0e+00
The log-likelihood improved by 0.06583
Iteration 2 of at most 20:
Convergence test P-value: 4e-248
The log-likelihood improved by 0.01558
Iteration 3 of at most 20:
Convergence test P-value: 4.8e-49
The log-likelihood improved by 0.00338
Iteration 4 of at most 20:
Convergence test P-value: 3.9e-09
The log-likelihood improved by 0.0005943
Iteration 5 of at most 20:
Convergence test P-value: 6.6e-05
The log-likelihood improved by 0.0002866
Iteration 6 of at most 20:
Convergence test P-value: 3.5e-01
The log-likelihood improved by < 0.0001
Iteration 7 of at most 20:
Convergence test P-value: 2.9e-01
The log-likelihood improved by < 0.0001
Iteration 8 of at most 20:
Convergence test P-value: 1.3e-01
The log-likelihood improved by < 0.0001
Iteration 9 of at most 20:
Convergence test P-value: 2.6e-01
The log-likelihood improved by < 0.0001
Iteration 10 of at most 20:
Convergence test P-value: 8.6e-01
Convergence detected. Stopping.
The log-likelihood improved by < 0.0001
This model was fit using MCMC. To examine model diagnostics and check for degeneracy, use the mcmc.diagnostics() function.
```

`mcmc.diagnostics(fit)`

```
Sample statistics summary:
Iterations = 10000:1009900
Thinning interval = 100
Number of chains = 1
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
edges -0.0147 3.74 0.0374 0.0408
degree1 0.0086 1.63 0.0163 0.0164
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
edges -7 -3 0 3 7
degree1 -3 -1 0 1 3
Are sample statistics significantly different from observed?
edges degree1 Overall (Chi^2)
diff. -0.0147 0.0086 NA
test stat. -0.3604 0.5243 0.2964
P-val. 0.7185 0.6001 0.8623
Sample statistics cross-correlations:
edges degree1
edges 1.0000 -0.4357
degree1 -0.4357 1.0000
Sample statistics auto-correlation:
Chain 1
edges degree1
Lag 0 1.000000 1.000000
Lag 100 0.087320 0.016873
Lag 200 0.008309 0.006461
Lag 300 -0.003954 0.008045
Lag 400 -0.012858 -0.023378
Lag 500 0.008232 -0.003913
Sample statistics burn-in diagnostic (Geweke):
Chain 1
Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5
edges degree1
-0.1701 1.7295
P-values (lower = worse):
edges degree1
0.86495 0.08372
```

```
Loading required package: latticeExtra
Loading required package: RColorBrewer
```

This is what you want to see in the MCMC diagnostics: the MCMC sample statistics are varying randomly around the observed values at each step (so the chain is “mixing” well) and the difference between the observed and simulated values of the sample statistics have a roughly bell-shaped distribution, centered at 0. The sawtooth pattern visible on the degree term deviation plot is due to the combination of discrete values and small range in the statistics: the observed number of degree 1 nodes is 3, and only a few discrete values are produced by the simulations. So the sawtooth pattern is is an inherent property of the statistic, not a problem with the fit.

There are many control parameters for the MCMC algorithm (“help(control.ergm)”), and we’ll play with some of these below. To see what the algorithm is doing at each step, you can drop the sampling interval down to 1:

```
fit <- ergm(flobusiness~edges+degree(1),
control=control.ergm(MCMC.interval=1)
```

This runs a version with every network returned, and might be useful if you are trying to debug a bad model fit.

Now let us look at a more problematic case, using a larger network:

```
data('faux.magnolia.high')
magnolia <- faux.magnolia.high
plot(magnolia, vertex.cex=.5)
```