JANE()
JANE()
JANE is an R package for fitting latent space network cluster models using an expectation-maximization (EM) algorithm. It enables flexible modeling of unweighted or weighted network data, with or without noise edges, and supports both directed and undirected networks, with or without degree and strength heterogeneity. Designed to efficiently handle large networks, JANE allows users to explore latent structure, identify actor-centric communities, and simulate networks with customizable clustering and connectivity patterns.
Details on the methodology underlying the package can be found in Arakkal and Sewell (2025).
When the noise_weights
argument in JANE()
is set to FALSE
(default), a standard latent space cluster
model is fit to the supplied unweighted network.
model = "NDH"
Applicable for undirected networks and assumes no degree heterogeneity. However, in real-world networks, it is rare to find no degree heterogeneity, as most networks exhibit considerable variation in node connectivity. Below is an example that fits an NDH model, specifying the dimension of the latent space as 2 and the number of clusters as 3.
model = "RS"
Applicable for undirected networks and adjusts for degree heterogeneity through the inclusion of actor-specific sociality effects. Below is an example that fits an RS model, where the number of clusters is specified to be 3 and a range of dimensions between 2 and 5 for the latent space is considered.
model = "RSR"
Applicable for directed networks and adjusts for degree heterogeneity through the inclusion of actor-specific sender and receiver effects. Below is an example that fits an RSR model, where a range of cluster numbers between 2 and 10 and a range of dimensions between 2 and 5 for the latent space is considered.
With JANE()
, weighted networks are particularly
useful in settings where noise edges are present. In such settings, when
the noise_weights
argument in JANE()
is set to
TRUE
, a latent space hurdle model (LSHM) is fit. The LSHM
leverages information from both the propensity to form an edge and the
observed edge weights to probabilistically down-weight noisy edges,
while preserving edges that are structurally meaningful, subsequently
enhancing community detection.
If the supplied network is a weighted network, in the absence of
noise it can be shown that the latent positions, regression parameters
associated with the logistic regression model, finite mixture model
parameters, and actor-specific cluster membership probabilities can all
be estimated separately from the generalized linear model parameters for
the edge weights. Thus, when noise_weights = FALSE
,
JANE()
will simply dichotomize the supplied weighted
network based on a threshold value of 0 and fit a standard latent space
cluster model.
You can fit the "NDH"
, "RS"
, or
"RSR"
models using any of the supported weight
distributions described below. When working with weighted networks, the
"RS"
and "RSR"
models account not only for
degree heterogeneity but also for strength heterogeneity.
family = "poisson"
Use this option for count-weighted networks. This setting models observed edge weights using a zero-truncated Poisson (ZTP) distribution:
guess_noise_weights
argumentfamily = "lognormal"
Use this option when edge weights are positive, continuous values. This setting models observed edge weights using a log-normal distribution:
guess_noise_weights
argumentfamily = "bernoulli"
(default)This setting is used for unweighted (binary) networks. In this case:
guess_noise_weights
is interpreted as the expected
proportion of all edges that are likely to be noiseguess_noise_weights
If guess_noise_weights is left as NULL (the default),
JANE()
will automatically set this value based on the
family
argument:
family %in% c("lognormal", "poisson")
, the 1st
percentile of the non-zero edge weights is used (log-transformed weights
for lognormal
)family = "bernoulli"
, a default value of 0.01
(i.e., 1%) is used, representing the assumed proportion of noisy
edgesJANE includes a built-in function,
sim_A()
, for simulating networks with varying clustering
and connectivity patterns. The function returns a list, which includes
the following two components:
A
: A sparse adjacency matrix of class ‘dgCMatrix’
representing the “true” underlying unweighted network with no noise
edgesW
: A sparse adjacency matrix of class ‘dgCMatrix’
representing the unweighted or weighted network, with or without noise
edgesThe relationship between A
and W
depends on
the values specified for family
and
noise_weights_prob
:
family = "bernoulli"
and
noise_weights_prob = 0
, then A
and
W
are identical (i.e., no noise simulated)family = "bernoulli"
and
noise_weights_prob > 0
, then W
contains
added noise edges, while A
represents the noise-free
underlying networkfamily %in% c("poisson", "lognormal")
and
noise_weights_prob = 0
, then W
contains
continuous or count-valued edge weights, but dichotomizing
W
at 0 will reproduce A
family %in% c("poisson", "lognormal")
and
noise_weights_prob > 0
, then W
is a noisy,
weighted network with additional spurious edges, and A
remains the true underlying structureBelow is an example to simulate an unweighted, undirected, noise-free network with a 2-dimensional latent space and 3 clusters, assuming no degree heterogeneity.
# Specify the 3 x 2 matrix containing the 1 x 2 mean vectors of the 3 bivariate normals
mus <- matrix(c(-1,-1, # Mean vector 1
1,-1, # Mean vector 2
1, 1), # Mean vector 3
nrow = 3,
ncol = 2,
byrow = TRUE)
# Specify the 2 x 2 x 3 array containing the 2 x 2 precision matrices of the 3 bivariate normals
omegas <- array(c(diag(rep(7,2)), # Precision matrix 1
diag(rep(7,2)), # Precision matrix 2
diag(rep(7,2))), # Precision matrix 3
dim = c(2,2,3))
# Simulate a network
sim_A(N = 100L,
model = "NDH",
mus = mus,
omegas = omegas,
p = rep(1/3, 3),
params_LR = list(beta0 = 1.0),
remove_isolates = TRUE)
The parameter beta0
above can be tuned to achieve a
desired network density:
# Specify the 3 x 2 matrix containing the 1 x 2 mean vectors of the 3 bivariate normals
mus <- matrix(c(-1,-1, # Mean vector 1
1,-1, # Mean vector 2
1, 1), # Mean vector 3
nrow = 3,
ncol = 2,
byrow = TRUE)
# Specify the 2 x 2 x 3 array containing the 2 x 2 precision matrices of the 3 bivariate normals
omegas <- array(c(diag(rep(7,2)), # Precision matrix 1
diag(rep(7,2)), # Precision matrix 2
diag(rep(7,2))), # Precision matrix 3
dim = c(2,2,3))
desired_density <- 0.1 # Target network density
min_density <- desired_density * 0.99 # Lower bound for acceptable density
max_density <- desired_density * 1.01 # Upper bound for acceptable density
n_act <- 100L # Number of actors in the network
density <- Inf # Initialize density to enter while loop
beta0 <- 0.5 # Initial value for intercept parameter
n_while_loop <- 0 # Counter for outer loop iterations
max_its <- 100 # Maximum number of iterations
change_beta0 <- 0.1 # Amount to adjust beta0 by
# Adjust beta0 until simulated network has the approximate desired density
while(! (density >= min_density & density <= max_density) ){
if(n_while_loop>max_its){
break
}
n_retry_isolate <- 0
retry_isolate <- T
# Retry until a network with no isolates is generated (this while loop is optional)
while(retry_isolate){
sim_data <- sim_A(N = n_act,
model = "NDH",
mus = mus,
omegas = omegas,
p = rep(1/3, 3),
params_LR = list(beta0 = beta0),
remove_isolates = TRUE)
n_retry_isolate <- n_retry_isolate + 1
# Accept network if no isolates remain, or if retried more than 10 times at the same beta0
if(nrow(sim_data$A) == n_act | n_retry_isolate>10){
retry_isolate <- F
}
}
# Compute network density
density <- igraph::graph.density(
igraph::graph_from_adjacency_matrix(sim_data$A, mode = "undirected")
)
# Adjust beta0 based on density feedback
if (density > max_density) {
beta0 <- beta0 - change_beta0
}
if (density < min_density) {
beta0 <- beta0 + change_beta0
}
n_while_loop <- n_while_loop + 1
}
A <- sim_data$A # Final simulated adjacency matrix
igraph::graph.density(igraph::graph_from_adjacency_matrix(A, mode = "undirected")) # Verify density
#> [1] 0.1
Below is an example that simulates a directed, weighted network with noise, degree and strength heterogeneity, a 2-dimensional latent space, and 3 clusters.
# Specify the 3 x 2 matrix containing the 1 x 2 mean vectors of the 3 bivariate normals
mus <- matrix(c(-1,-1, # Mean vector 1
1,-1, # Mean vector 2
1, 1), # Mean vector 3
nrow = 3,
ncol = 2,
byrow = TRUE)
# Specify the 2 x 2 x 3 array containing the 2 x 2 precision matrices of the 3 bivariate normals
omegas <- array(c(diag(rep(7,2)), # Precision matrix 1
diag(rep(7,2)), # Precision matrix 2
diag(rep(7,2))), # Precision matrix 3
dim = c(2,2,3))
# Simulate a network
sim_A(N = 100L,
model = "RSR",
family = "poisson",
mus = mus,
omegas = omegas,
p = rep(1/3, 3),
params_LR = list(beta0 = 1),
params_weights = list(beta0 = 2),
noise_weights_prob = 0.1,
mean_noise_weights = 1,
remove_isolates = TRUE)
JANE()
allows for the following model selection criteria
to choose the number of clusters and the best initialization of the EM
algorithm (smaller values are favored):
"Total_BIC"
(i.e., BIC): Sum of \(BIC_{model}\) and \(BIC_{mbc}\), where \(BIC_{model}\) is the BIC computed from
logistic regression or Hurdle model component and \(BIC_{mbc}\) is the BIC computed from model
based clustering component. This is the model selection criterion
proposed by Handcock, Raftery, and Tantrum
(2007)"Total_ICL"
(i.e., BICL): (default) sum of \(BIC_{model}\) and \(ICL_{mbc}\), this criterion is similar to
"Total_BIC"
, but uses the integrated completed likelihood
(ICL) for the model based clustering component, which tends to favor
more well-separated clustersBased on simulation studies, Biernacki, Celeux, and Govaert (2000) recommends that when the interest in mixture modeling is cluster analysis, instead of density estimation, the \(ICL_{mbc}\) criterion should be favored over the \(BIC_{mbc}\) criterion, as the \(BIC_{mbc}\) criterion tends to overestimate the number of clusters. The \(BIC_{mbc}\) criterion is designed to choose the number of components in a mixture model rather than the number of clusters.
Warning: It is not certain whether it is appropriate to use the model selection criterion above to select the dimension of the latent space \(D\).
Below is an example that fits an RSR model, where a range of cluster
numbers between 2 and 10 is considered for a 2-dimensional latent space
and "Total_BIC"
is used to select the optimal number of
clusters and best initialization of the EM algorithm.
JANE(A = network_adjacency_matrix,
D = 2,
K = 2:10,
model = "RSR",
noise_weights = FALSE,
control = list(IC_selection = "Total_BIC"))
Below is an example that fits an RSR model for a 2-dimensional latent
space with 3 clusters and "Total_ICL"
is used to select the
optimal initialization of the EM algorithm from 10 unique starts. Note:
the number of starts for the EM algorithm is controlled through the
control
argument by supplying a value for
n_start
.
JANE()
has three initialization strategies to generate
starting values for the EM algorithm, controlled through the
initialization
argument:
"GNN"
: uses a type of graphical neural network approach
to generate initial values (default). Details of the graphical neural
network approach can be found in Arakkal and
Sewell (2025)"random"
: uses random initial values"JANE.initial_values"
representing initial values for the EM algorithm. See
?specify_initial_values
for details on how to specify
initial valuesBelow is an example where starting values are supplied to
JANE()
using specify_initial_values()
.
# Specify starting values
D <- 3
K <- 5
N <- nrow(sim_data$A)
n_interior_knots <- 5L
U <- matrix(stats::rnorm(N*D), nrow = N, ncol = D)
omegas <- stats::rWishart(n = K, df = D+1, Sigma = diag(D))
mus <- matrix(stats::rnorm(K*D), nrow = K, ncol = D)
p <- extraDistr::rdirichlet(n = 1, rep(3,K))[1,]
Z <- extraDistr::rdirichlet(n = N, alpha = rep(1, K))
beta <- stats::rnorm(n = 1 + 2*(1 + n_interior_knots))
my_starting_values <- specify_initial_values(A = network_adjacency_matrix,
D = D,
K = K,
model = "RSR",
n_interior_knots = n_interior_knots,
U = U,
omegas = omegas,
mus = mus,
p = p,
Z = Z,
beta = beta)
# Run JANE using my_starting_values (no need to specify D and K as function will
# determine those values from my_starting_values)
JANE(A = network_adjacency_matrix,
initialization = my_starting_values,
model = "RSR",
noise_weights = FALSE)
The prior distributions are specified as follows:
The same prior is used for \(k = 1, \ldots, K\):
\[ \boldsymbol{\Omega}_k \sim \text{Wishart}(c, \boldsymbol{G}^{-1}) \]
\[ \boldsymbol{\mu}_k \mid \boldsymbol{\Omega}_k \sim \text{MVN}(\boldsymbol{a}, (b \boldsymbol{\Omega}_k)^{-1}) \]
For the current implementation, we require that all elements of the
nu
vector be \(\geq 1\) to
prevent negative mixture weights for empty clusters:
\[ \boldsymbol{p} \sim \text{Dirichlet}(\nu_1, \ldots, \nu_K) \]
\[ \boldsymbol{\beta}_{LR} \sim \text{MVN}(\boldsymbol{e}, \boldsymbol{F}^{-1}) \]
\[ q \sim \text{Beta}(h, l) \]
\[ \boldsymbol{\beta}_{GLM} \sim \text{MVN}(\boldsymbol{e}_2, \boldsymbol{F}_2^{-1}) \]
\[ \tau^2_{\text{weights}} \sim \text{Gamma}\left(\frac{m_1}{2}, \frac{o_1}{2}\right) \]
\[ \boldsymbol{\beta}_{GLM} \mid \tau^2_{\text{weights}} \sim \text{MVN}\left(\boldsymbol{e}_2, (\tau^2_{\text{weights}} \boldsymbol{F}_2)^{-1}\right) \]
\[ \tau^2_{\text{noise weights}} \sim \text{Gamma}\left(\frac{m_2}{2}, \frac{o_2}{2}\right) \]
where, \(D\) is the dimension of the
latent space, \(K\) is the number of
clusters, \[dim(\boldsymbol{\beta}_{LR}) =
dim(\boldsymbol{\beta}_{GLM}) = \begin{cases}1 &
\text{"NDH''}\\ \zeta+2 & \text{"RS''} \\
2\zeta+3 & \text{"RSR''},\end{cases}\] and \(\zeta\) is the number of interior knots
used in fitting a natural cubic spline for degree heterogeneity (and
connection strength heterogeneity if working with weighted network)
models (i.e., "RS"
and "RSR"
only).
# Specify prior hyperparameters
D <- 3
K <- 5
n_interior_knots <- 5L
a <- rep(1, D)
b <- 3
c <- 4
G <- 10*diag(D)
nu <- rep(2, K)
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))
my_prior_hyperparameters <- specify_priors(D = D,
K = K,
model = "RS",
n_interior_knots = n_interior_knots,
a = a,
b = b,
c = c,
G = G,
nu = nu,
e = e,
f = f)
# Run JANE using supplied prior hyperparameters (no need to specify D and K
# as function will determine those values from my_prior_hyperparameters)
JANE(A = network_adjacency_matrix,
initialization = "GNN",
model = "RS",
noise_weights = FALSE,
control = list(priors = my_prior_hyperparameters))
# Specify first set of prior hyperparameters
D <- 3
K <- 5
n_interior_knots <- 5L
a <- rep(1, D)
b <- 3
c <- 4
G <- 10*diag(D)
nu <- rep(2, K)
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))
my_prior_hyperparameters_1 <- specify_priors(D = D,
K = K,
model = "RS",
n_interior_knots = n_interior_knots,
a = a,
b = b,
c = c,
G = G,
nu = nu,
e = e,
f = f)
# Specify second set of prior hyperparameters
D <- 2
K <- 3
n_interior_knots <- 5L
a <- rep(1, D)
b <- 3
c <- 4
G <- 10*diag(D)
nu <- rep(2, K)
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))
my_prior_hyperparameters_2 <- specify_priors(D = D,
K = K,
model = "RS",
n_interior_knots = n_interior_knots,
a = a,
b = b,
c = c,
G = G,
nu = nu,
e = e,
f = f)
# Create nested list
my_prior_hyperparameters <- list(my_prior_hyperparameters_1,
my_prior_hyperparameters_2)
# Run JANE using supplied prior hyperparameters (no need to specify D and K
# as function will determine those values from my_prior_hyperparameters)
JANE(A = network_adjacency_matrix,
initialization = "GNN",
model = "RS",
noise_weights = FALSE,
control = list(priors = my_prior_hyperparameters))
Unevaluated calls using quote()
can be supplied as
values for specific hyperparameters. This is particularly useful when
running JANE()
for multiple combinations of \(K\) and \(D\).
# Specify prior hyperparameters as unevaluated calls
n_interior_knots <- 5L
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))
my_prior_hyperparameters <- specify_priors(model = "RS",
n_interior_knots = n_interior_knots,
a = quote(rep(1, D)),
b = b,
c = quote(D + 1),
G = quote(10*diag(D)),
nu = quote(rep(2, K)),
e = e,
f = f)
# Run JANE using supplied prior hyperparameters
JANE(A = network_adjacency_matrix,
D = 2:5,
K = 2:10,
initialization = "GNN",
model = "RS",
noise_weights = FALSE,
control = list(priors = my_prior_hyperparameters))
JANE()
You can speed up fitting models for multiple combinations of cluster
number K
, latent space dimension D
, and number
of initializations of the EM algorithm n_start
, by running
JANE()
in parallel. This leverages the future
and future.apply
packages to distribute computation across
cores.
Below is an example using 5 parallel workers to run
JANE()
with parallel backend enabled, followed by resetting
to sequential processing.
JANE()
When working with large sparse networks, the case-control
approximation implemented in JANE()
can reduce
computational cost. This approach leverages approximation methods
developed by Raftery et al. (2012). With
the case-control implementation, the likelihood includes all edges
(where most of the information lies) but uses a Monte Carlo
approximation of terms involving the unconnected dyads of the network
(i.e., non-edges). Thus, the cost of evaluating the likelihood becomes
linear, rather than quadratic, in \(N\).
Below is an example running JANE()
with the case-control
approach, sampling 20 controls (i.e., non-edges) per actor.
Once you have fit a model using the JANE()
function, you
can inspect and analyze the results using several built-in S3 methods.
These methods work with the JANE
S3 class object returned
from a model fit.
For the examples in this section, we assume that res
is
the object returned by JANE()
.
print()
The print()
method gives a summary of the optimal model
fit and available components.
This displays:
res
summary()
Use the summary()
method to extract and display detailed
information about the estimated model.
You can also compare estimated cluster assignments to known true
assigments using the true_labels
argument, or inspect
starting values using initial_values = TRUE
.
Output includes:
true_labels
is
supplied)noise_weights = TRUE
. This
includes the estimated conditional probability that a specific edge is a
noise or a non-noise edge, and their corresponding noise edge cluster
assignment based on a hard clustering rule of \(\{h | \hat{Z}^{W}_{eh} = max(\hat{Z}^{W}_{e1},
\hat{Z}^{W}_{e2})\}\) for \(e =
1\dots,|E|\), where \(\hat{Z}^{W}_{e1}\) and \(\hat{Z}^{W}_{e2}\) are the estimated
conditional probabilities that the \(e^{th}\) edge is a non-noise and noise
edge, respectively, and \(|E|\)
represents the total number of edges in the network (for undirected
networks, only the upper diagonal edges are retained). Noise edge
cluster labels are defined as: 1 = non-noise edge and 2 = noise
edgeplot()
The plot()
method provides visualizations depending on
the type
argument.
type = "lsnc"
)type = "misclassified"
)true_labels_vec
in this
example)type = "uncertainty"
)type = "trace_plot"
)alpha_edge
, alpha_node
, zoom
,
swap_axes
, rotation_angle
,
cluster_cols
, etc.remove_noise_edges = TRUE
removes noise edges based on
noise edge hard cluster assignment (only applicable if
JANE()
was run with noise_weights = TRUE
)