Getting started with dist.structure

library(dist.structure)
library(algebraic.dist)

What dist.structure is

dist.structure is a small, principled extension of algebraic.dist. It lets you build and query random variables with internal structure: coherent reliability systems decomposed into components via a structure function.

Every object produced by dist.structure is also a dist (from algebraic.dist), so the full distribution algebra (mean, vcov, sampler, surv, cdf, density) works on it. On top of that, dist.structure exposes structural queries you cannot ask of a plain distribution: minimal path sets, system signature, Birnbaum importance, duality, and more.

This vignette is a five-minute tour. Other vignettes go deeper into specific topics.

Your first structured distribution

Start with a series system of three independent exponentials:

sys <- series_dist(list(
  exponential(0.5),
  exponential(0.3),
  exponential(0.2)
))
sys
#> <dist_structure: series_dist>
#>   components: 3

The object is a dist. Query it like any distribution:

algebraic.dist::surv(sys)(1)           # P(system survives past t = 1)
#> [1] 0.3678794
algebraic.dist::cdf(sys)(1)            # P(system fails by t = 1)
#> [1] 0.6321206
set.seed(1)
algebraic.dist::sampler(sys)(5)        # five system-lifetime samples
#> [1] 1.5103637 2.3632856 0.2914135 0.2795905 0.4901533

A series system of independent exponential components is itself exponential, with rate equal to the sum of the component rates: here the system is Exp(sum(rates)) = Exp(1). Confirm:

algebraic.dist::surv(sys)(1)
#> [1] 0.3678794
exp(-(0.5 + 0.3 + 0.2) * 1)
#> [1] 0.3678794
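
The identity follows because a series system survives past t only when every component does, so under independence the component survival functions multiply. A minimal base-R sketch of that product form, independent of the package (the names rates and surv_series are illustrative, not part of dist.structure):

```r
# Illustrative base-R check; rates and surv_series are hypothetical names.
rates <- c(0.5, 0.3, 0.2)

# Series survival = product of the independent component survival functions.
surv_series <- function(t) prod(exp(-rates * t))

surv_series(1)                                   # product form
exp(-sum(rates) * 1)                             # Exp(sum(rates)) form
all.equal(surv_series(1), exp(-sum(rates) * 1))  # the two forms coincide
```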

Structural queries

Now ask topology questions the base distribution algebra cannot answer:

ncomponents(sys)                       # 3 components
#> [1] 3
phi(sys, c(1, 1, 0))                   # system functions? No (series needs all)
#> [1] 0
phi(sys, c(1, 1, 1))                   # Yes
#> [1] 1
min_paths(sys)                         # the single path: all components
#> [[1]]
#> [1] 1 2 3
min_cuts(sys)                          # three singleton cuts
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3
system_signature(sys)                  # (1, 0, 0): fails at first failure
#> [1] 1 0 0

The structure function phi takes a component state vector (0 = failed, 1 = functioning) and returns 1 if the system is functioning. For a series system, phi = AND: all components must function.
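
To make the AND behaviour concrete, the sketch below enumerates all eight state vectors of a three-component series system in base R. The function phi_series is a hypothetical stand-in written for illustration; it is not the package's implementation of phi.

```r
# Hypothetical stand-in for phi on a series system (not the package's code).
phi_series <- function(x) as.integer(all(x == 1))

# Enumerate all 2^3 component state vectors and evaluate each one.
states <- expand.grid(x1 = 0:1, x2 = 0:1, x3 = 0:1)
states$phi <- apply(states, 1, phi_series)
states
```

Only the all-ones row has phi equal to 1, which is the same fact min_paths(sys) reports as a single path containing every component.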

A richer topology: k-out-of-n

Series and parallel are special cases of k-out-of-n: the system functions if at least k of its n components function. dist.structure has a shortcut:

# 2-out-of-3 of heterogeneous exponentials; closed-form specialization
kofn <- exp_kofn(k = 2, rates = c(1, 2, 3))
ncomponents(kofn)
#> [1] 3
algebraic.dist::surv(kofn)(1)
#> [1] 0.06988315
system_signature(kofn)                 # (0, 1, 0) for 2-of-3
#> [1] 0 1 0
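
The signatures reported so far, (1, 0, 0) for series and (0, 1, 0) for 2-out-of-3, depend only on the structure function. A brute-force sketch in base R, assuming exchangeable (i.i.d.) component lifetimes so that every failure ordering is equally likely; signature_bruteforce and the two phi helpers are hypothetical names, not package functions:

```r
# Signature by brute force: for each ordering of component failures, record
# the failure count at which the system first goes down.
signature_bruteforce <- function(phi, n) {
  # All permutations of a vector (fine for small n).
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (rest in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], rest)
    }
    out
  }
  counts <- numeric(n)
  for (ord in perms(seq_len(n))) {
    x <- rep(1, n)                     # all components start functioning
    for (i in seq_len(n)) {
      x[ord[i]] <- 0                   # the i-th failure occurs
      if (phi(x) == 0) {               # system is down: record and stop
        counts[i] <- counts[i] + 1
        break
      }
    }
  }
  counts / factorial(n)
}

phi_series <- function(x) as.integer(all(x == 1))
phi_2of3   <- function(x) as.integer(sum(x) >= 2)

signature_bruteforce(phi_series, 3)    # expected: 1 0 0
signature_bruteforce(phi_2of3, 3)      # expected: 0 1 0
```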

dist.structure uses the k-out-of-n:G convention: the system functions if and only if at least k of its n components function. Consequently kofn_dist(k = 1, ...) is a parallel system and kofn_dist(k = n, ...) is a series system.
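
The convention can be checked by hand. The sketch below is plain base R under an independence assumption; phi_kofn and reliability_kofn are illustrative names, not package functions. It computes system reliability by enumerating component states, and the k = 2 value should agree with surv(kofn)(1) above.

```r
# Hypothetical k-out-of-n:G structure function (not the package's code).
phi_kofn <- function(x, k) as.integer(sum(x) >= k)

# System reliability by enumerating component states, assuming independence.
# p[j] is the probability that component j is functioning.
reliability_kofn <- function(p, k) {
  n <- length(p)
  states <- as.matrix(expand.grid(rep(list(0:1), n)))
  # Probability of each state vector under independent components.
  pr <- apply(states, 1, function(x) prod(ifelse(x == 1, p, 1 - p)))
  sum(pr * apply(states, 1, phi_kofn, k = k))
}

p <- exp(-c(1, 2, 3) * 1)      # component survival probabilities at t = 1
reliability_kofn(p, k = 2)     # 2-out-of-3
reliability_kofn(p, k = 3)     # k = n is series: equals prod(p)
reliability_kofn(p, k = 1)     # k = 1 is parallel: equals 1 - prod(1 - p)
```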

Sampling, means, and composition

Every distribution generic composes naturally:

set.seed(42)
samples <- algebraic.dist::sampler(kofn)(5000)
mean(samples)
#> [1] 0.4510283
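
The sample mean can be sanity-checked without the package: for a nonnegative lifetime, the mean equals the integral of the survival function. The sketch below writes out the 2-out-of-3 survival function by inclusion-exclusion over the minimal paths {1,2}, {1,3}, {2,3}, assuming independent exponential components with rates 1, 2, 3; surv_2of3 is an illustrative name.

```r
# Survival function of the 2-out-of-3 system: each pair of components
# survives with rate equal to the sum of its two rates (3, 4, 5), and all
# three survive with rate 6.
surv_2of3 <- function(t) {
  exp(-3 * t) + exp(-4 * t) + exp(-5 * t) - 2 * exp(-6 * t)
}

# Mean lifetime = integral of the survival function over [0, Inf).
mean_numeric <- integrate(surv_2of3, lower = 0, upper = Inf)$value
mean_numeric

# Closed form of the same integral.
1/3 + 1/4 + 1/5 - 2/6
```

Both values come out at 0.45, close to the Monte Carlo estimate above.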

You can also substitute or compose components:

# Swap component 2 with a Weibull.
kofn2 <- substitute_component(kofn, j = 2,
  new_component = weibull_dist(shape = 2, scale = 1))
algebraic.dist::surv(kofn2)(1)
#> [1] 0.1584907
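
The substituted system can be cross-checked in base R, since only the component survival probabilities change while the 2-out-of-3 topology stays fixed. This sketch assumes independent components and that weibull_dist(shape, scale) uses the same parameterization as stats::pweibull; the variable names are illustrative.

```r
# Component survival probabilities at t = 1 after the swap.
p <- c(
  pexp(1, rate = 1, lower.tail = FALSE),                   # component 1
  pweibull(1, shape = 2, scale = 1, lower.tail = FALSE),   # component 2 (new)
  pexp(1, rate = 3, lower.tail = FALSE)                    # component 3
)

# 2-out-of-3 reliability polynomial for independent components.
r_swapped <- p[1] * p[2] + p[1] * p[3] + p[2] * p[3] - 2 * prod(p)
r_swapped
```

The result should match surv(kofn2)(1) to printed precision.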

Importance measures

How much does each component matter? dist.structure provides four complementary importance measures:

# Structural importance: fraction of pivotal states for component 1.
structural_importance(kofn, j = 1)
#> [1] 0.5

# Birnbaum (reliability) importance: dR/dp_j at given component
# reliabilities.
birnbaum_importance(kofn, j = 1, p = 0.9)
#> [1] 0.18

# Criticality importance at time t = 0.5.
criticality_importance(kofn, j = 1, t = 0.5)
#> [1] 0.2548441

# Vesely-Fussell importance at time t = 0.5 (via minimal cut sets).
vesely_fussell_importance(kofn, j = 1, t = 0.5)
#> [1] 0.5480401

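Each of the four answers “how important is component j?” differently. The values differ even for this small system (0.5, 0.18, 0.25, 0.55); the component rankings they produce tend to agree for simple systems and diverge for complex ones. The importance measures vignette works through the distinctions.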
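
For readers who want to see where the four numbers come from, here is a hand computation in base R for the 2-out-of-3 example. It assumes independent components and the common textbook definitions (Birnbaum as a difference of conditional reliabilities, criticality in its failure form, Vesely-Fussell via minimal cuts containing the component). All names are illustrative and the package's internals may differ.

```r
# Reliability polynomial of a 2-out-of-3 system with independent components.
R <- function(p) p[1] * p[2] + p[1] * p[3] + p[2] * p[3] - 2 * prod(p)

# Birnbaum importance of component j: R(p | p_j = 1) - R(p | p_j = 0).
birnbaum <- function(p, j) {
  hi <- p; hi[j] <- 1
  lo <- p; lo[j] <- 0
  R(hi) - R(lo)
}

structural <- birnbaum(rep(0.5, 3), 1)  # structural = Birnbaum at p = 1/2
birn_09    <- birnbaum(rep(0.9, 3), 1)  # Birnbaum at p = 0.9

p <- exp(-c(1, 2, 3) * 0.5)             # component reliabilities at t = 0.5
q <- 1 - p                              # component unreliabilities
F_sys <- 1 - R(p)                       # system unreliability

# Criticality importance (failure form): Birnbaum * q_j / F_sys.
crit <- birnbaum(p, 1) * q[1] / F_sys

# Vesely-Fussell: P(some minimal cut containing component 1 has failed) / F_sys.
# The cuts containing 1 are {1,2} and {1,3}: 1 failed AND (2 or 3 failed).
vf <- q[1] * (1 - p[2] * p[3]) / F_sys

c(structural = structural, birnbaum = birn_09,
  criticality = crit, vesely_fussell = vf)
```

Under these definitions the four values reproduce the package output above to printed precision.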

Ecosystem

dist.structure is the shared topology and data-generating-process (DGP) layer for the reliability packages in the queelius ecosystem.

Any object from these packages that declares dist_structure in its class vector inherits the full protocol automatically: topology queries, importance measures, composition operators, and the full dist interface.