README

WE STRONGLY RECOMMEND INSTALLING SIMER ON Microsoft R Open (https://mran.microsoft.com/download/).

Installation

install.packages("simer")

devtools::install_github("xiaolei-lab/SIMER")

Data Preparation

Genotype

Genotype data should be in Numeric format (either m * n or n * m is acceptable, where m is the number of SNPs and n is the number of individuals). Other genotype formats, such as PLINK Binary (for details see http://zzz.bwh.harvard.edu/plink/data.shtml#bed), VCF, or Hapmap, can be converted to Numeric format using the MVP.Data function in rMVP (https://github.com/xiaolei-lab/rMVP).

Genetic map

A genetic map is necessary in SIMER. The first column is the SNP name, the second column is the Chromosome ID, the third column is physical position, the fourth column is REF, and the fifth column is ALT. This will be used to generate annotation data, genotype data, and phenotype data.

SNP	Chrom	BP	REF	ALT
1_10673082	1	10673082	T	C
1_10723065	1	10723065	A	G
1_11407894	1	11407894	A	G
1_11426075	1	11426075	T	C
1_13996200	1	13996200	T	C
1_14638936	1	14638936	T	C

Pedigree

SIMER supports user designed pedigree to control mating process. User designed pedigree is useful only in userped reproduction. The first column is sample id, the second column is paternal id, and the third column is maternal id. Please make sure that paternal id and maternal id can match to genotype data.

Index	Sire	Dam
41	1	11
42	1	11
43	1	11
44	1	11
45	2	12
46	2	12

Data Input

Basic

Users should prepare at least two datasets: a genetic map and genotype data.

genetic map, SNP map information, the first column is SNP name, the second column is Chromosome ID, the third column is physical position, the fourth column is REF, and the fifth column is ALT.
genotype data, Numeric format (either m * n or n * m is acceptable, m is the number of SNPs, n is the number of individuals)

pop.map <- read.table("map.txt", header = TRUE)
pop.geno <- read.table("genotype.txt")
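
Before passing these objects to SIMER, it can help to confirm that the map and the genotype matrix agree with each other. The following is a minimal base R sketch, not part of simer: check_inputs is a hypothetical helper, the toy data stand in for map.txt and genotype.txt, and the assumption that Numeric genotypes are coded 0, 1, 2 without missing values is made only for this illustration.

```r
# Hypothetical helper (not part of simer): check that a genetic map and a
# Numeric-format genotype matrix are consistent with each other.
check_inputs <- function(pop.map, pop.geno) {
  pop.geno <- as.matrix(pop.geno)
  m <- nrow(pop.map)                        # number of SNPs listed in the map
  stopifnot(ncol(pop.map) >= 5)             # SNP, Chrom, BP, REF, ALT
  stopifnot(m %in% dim(pop.geno))           # m * n or n * m
  stopifnot(all(pop.geno %in% c(0, 1, 2)))  # assumed genotype codes
  # Report which axis holds the SNPs (ambiguous if m equals n)
  if (nrow(pop.geno) == m) "m * n (SNPs in rows)" else "n * m (individuals in rows)"
}

# Toy data standing in for map.txt and genotype.txt
toy.map <- data.frame(
  SNP   = c("1_10673082", "1_10723065", "1_11407894"),
  Chrom = 1,
  BP    = c(10673082, 10723065, 11407894),
  REF   = c("T", "A", "A"),
  ALT   = c("C", "G", "G")
)
toy.geno <- matrix(c(0, 1, 2, 0, 1, 2, 2, 1, 0, 0, 1, 1), nrow = 3)  # 3 SNPs x 4 individuals
orientation <- check_inputs(toy.map, toy.geno)
```

The helper stops with an error when the dimensions or codes do not match, which is usually easier to diagnose than a failure later in the simulation.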

Optional

The mating process can be controlled by a user-designed pedigree.

pedigree, pedigree information, the first column is sample id, the second column is paternal id, and the third column is maternal id. Note that the individuals in the pedigree do not need to be sorted by the date of birth, and the missing value can be replaced by NA or 0.

userped <- read.table("userped.txt", header = TRUE)
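
Because paternal and maternal ids must match the genotype data, a quick check for unmatched parents can catch problems early. This is a sketch in base R under stated assumptions: missing_parents is a hypothetical helper (not part of simer), and geno.id is an assumed vector of the sample ids that have genotypes.

```r
# Hypothetical helper (not part of simer): list parents named in a pedigree
# that are absent from a vector of genotyped sample ids.
missing_parents <- function(userped, geno.id) {
  parents <- unique(c(userped[[2]], userped[[3]]))      # columns 2 and 3: sire, dam
  parents <- parents[!is.na(parents) & parents != 0]    # NA or 0 means unknown
  setdiff(parents, geno.id)
}

toy.ped <- data.frame(
  Index = c(41, 42, 45, 46),
  Sire  = c(1, 1, 2, NA),
  Dam   = c(11, 11, 12, 0)
)
unmatched <- missing_parents(toy.ped, geno.id = c(1, 2, 11))
```

In this toy pedigree only dam 12 lacks a genotype, so it is the single id reported in unmatched.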

Quick Start

All simulation processes can be divided into two steps: 1) generate the simulation parameters; 2) run the simulation.

Quick Start for Population Simulation

# Generate all simulation parameters
SP <- param.simer(out = "simer")

# Run Simer
SP <- simer(SP)

Quick Start for Genotype Simulation

# Generate annotation simulation parameters
SP <- param.annot(species = "pig")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Quick Start for Phenotype Simulation

# Generate annotation simulation parameters
SP <- param.annot(species = "pig")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Genotype Simulation

Genotype data in SIMER is generated randomly or through an external genotype matrix. Chromosome crossovers and base mutations depend on block information and recombination information of Annotation data.

Gallery of genotype simulation parameters

Genotype simulation parameters (set with param.geno):

Parameter	Default	Options	Description
pop.geno	NULL	big.matrix or matrix (either m * n or n * m is acceptable, m is the number of SNPs, n is the number of individuals)	the genotype data.
inrows	1	1 or 2	"1": one-row genotype represents an individual; "2": two-row genotype represents an individual.
pop.marker	1e4	num	the number of markers.
pop.ind	1e2	num	the number of individuals in the base population.
prob	NULL	num vector	the genotype code probability.
rate.mut	list(qtn = 1e-8, snp = 1e-8)	list	the mutation rate of the genotype data.
cld	FALSE	TRUE or FALSE	whether to generate a complete LD genotype data when inrows = 2.

Annotation simulation parameters (set with param.annot):

Parameter	Default	Options	Description
pop.map	NULL	data.frame	the map data with annotation information.
species	NULL	character	the species of genetic map, which can be "arabidopsis", "cattle", "chicken", "dog", "horse", "human", "maize", "mice", "pig", and "rice".
pop.marker	1e4	num	the number of markers.
num.chr	18	num	the number of chromosomes.
len.chr	1.5e8	num	the length of chromosomes.
recom.spot	FALSE	TRUE or FALSE	whether to generate recombination events.
range.hot	4:6	num vector	the recombination times range in the hot spot.
range.cold	1:5	num vector	the recombination times range in the cold spot.

Generate an external or species-specific or random genetic map

# Real genetic map
mapPath <- system.file("extdata", "06map", "pig_map.txt", package = "simer")
pop.map <- read.table(mapPath, header = TRUE)

# Generate annotation simulation parameters
SP <- param.annot(pop.map = pop.map)

# Run annotation simulation
SP <- annotation(SP)

Users can also use a built-in real genetic map by setting species, which can be “arabidopsis”, “cattle”, “chicken”, “dog”, “horse”, “human”, “maize”, “mice”, “pig”, and “rice”.

# Generate annotation simulation parameters
SP <- param.annot(species = "pig")

# Run annotation simulation
SP <- annotation(SP)

Users can generate a random genetic map with pop.marker, num.chr, and len.chr.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, num.chr = 18, len.chr = 1.5e8)

# Run annotation simulation
SP <- annotation(SP)

Generate an external or species-specific or random genotype matrix

Users can use real genotype data with specific genetic structure for subsequent simulation.

# Create a genotype matrix
# pop.geno <- read.table("genotype.txt")
# pop.geno <- bigmemory::attach.big.matrix("genotype.geno.desc")
pop.geno <- matrix(c(0, 1, 2, 0), nrow = 1e2, ncol = 1e4, byrow = TRUE)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.geno = pop.geno)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Users can also generate a genotype matrix from a built-in real genetic map by setting species, which can be “arabidopsis”, “cattle”, “chicken”, “dog”, “horse”, “human”, “maize”, “mice”, “pig”, and “rice”.

# Generate annotation simulation parameters
SP <- param.annot(species = "pig")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Users can also specify pop.marker and pop.ind to generate random genotype data.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Generate a genotype matrix with complete linkage disequilibrium

Users can generate a genotype matrix with complete linkage disequilibrium by setting inrows = 2 and cld = TRUE.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2, inrows = 2, cld = TRUE)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Add chromosome crossovers and mutations to genotype matrix

With annotation data, chromosome crossovers and mutations can be added to a genotype matrix.

# Generate annotation simulation parameters
# If recom.spot = TRUE, chromosome crossovers will be added to genotype matrix
SP <- param.annot(pop.marker = 1e4, recom.spot = TRUE)
# Generate genotype simulation parameters
# Base mutation rates of QTN and SNP are 1e-8
SP <- param.geno(SP = SP, pop.ind = 1e2, rate.mut = list(qtn = 1e-8, snp = 1e-8))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)

Note that recombination only exists in meiosis. Therefore, some reproduction methods such as clone do not have recombination processes. Users can set recom.spot = FALSE to add only mutations to the genotype matrix.

# Generate annotation simulation parameters
# If recom.spot = FALSE, chromosome crossovers will not be added to genotype matrix
SP <- param.annot(pop.marker = 1e4, recom.spot = FALSE)
# Generate genotype simulation parameters
# Base mutation rates of QTN and SNP are 1e-8
SP <- param.geno(SP = SP, pop.ind = 1e2, rate.mut = list(qtn = 1e-8, snp = 1e-8))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
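
To get a feel for what a mutation rate of 1e-8 means at this scale, the expected number of mutated genotype entries can be worked out directly. This is back-of-the-envelope arithmetic that assumes every genotype entry mutates independently with the given probability per generation; that assumption is made for illustration and is not a description of SIMER's internals.

```r
rate.snp   <- 1e-8  # mutation rate used in the examples above
pop.marker <- 1e4   # number of markers
pop.ind    <- 1e2   # number of individuals

n.entries    <- pop.marker * pop.ind        # 1e6 genotype entries
expected.mut <- rate.snp * n.entries        # expected mutations per generation
p.none       <- (1 - rate.snp)^n.entries    # probability that nothing mutates
```

Under this assumption the expected count is about 0.01 per generation, so a single small run will usually show no mutations at all; larger populations, more markers, or more generations are needed before mutation becomes visible.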

Phenotype Simulation

Phenotype data in SIMER is generated according to different models, which include:
(1) Single-Trait Model
(2) Multiple-Trait Model
(3) Repeated Record Model
(4) Genetic Effect Model (Additive effect, Dominant effect, and Genetic-Genetic interaction effect)
(5) Genetic Model with Varied QTN Effect Distributions (QTN effect distribution: Normal distribution, Geometric distribution, Gamma distribution, Beta distribution, and their combination)
(6) Linear Mixed Model (Fixed effect, Covariate, Environmental Random effect, Genetic Random effect, Genetic-Environmental interaction effect, and Environmental-Environmental interaction effect)

Gallery of phenotype simulation parameters

Phenotype simulation parameters (set with param.pheno):

Parameter	Default	Options	Description
pop	NULL	data.frame	the population information containing environmental factors and other effects.
pop.ind	100	num	the number of individuals in the base population.
pop.rep	1	num	the repeated times of repeated records.
pop.rep.bal	TRUE	TRUE or FALSE	whether repeated records are balanced.
pop.env	NULL	list	a list of environmental factors setting.
phe.type	list(tr1 = "continuous")	list	a list of phenotype types.
phe.model	list(tr1 = "T1 = A + E")	list	a list of genetic model of phenotype such as "T1 = A + E".
phe.h2A	list(tr1 = 0.3)	list	a list of additive heritability.
phe.h2D	list(tr1 = 0.1)	list	a list of dominant heritability.
phe.h2GxG	list(tr1 = 0.1)	list	a list of GxG interaction heritability.
phe.h2GxE	list(tr1 = 0.1)	list	a list of GxE interaction heritability.
phe.h2PE	list(tr1 = 0.1)	list	a list of permanent environmental heritability.
phe.var	NULL	list	a list of phenotype variance.
phe.corA	diag(nTrait)	matrix	the additive genetic correlation matrix.
phe.corD	diag(nTrait)	matrix	the dominant genetic correlation matrix.
phe.corGxG	list(diag(nTrait))	list	a list of the GxG genetic correlation matrix.
phe.corPE	diag(nTrait)	matrix	the permanent environmental correlation matrix.
phe.corE	diag(nTrait)	matrix	the residual correlation matrix.

Annotation simulation parameters (set with param.annot):

Parameter	Default	Options	Description
pop.map	NULL	data.frame	the map data with annotation information.
qtn.model	"A"	character	the genetic model of QTN such as "A + D".
qtn.index	NULL	list	the QTN index for each trait.
qtn.num	list(tr1 = 10)	list	the QTN number for (each group in) each trait.
qtn.dist	list(tr1 = "norm")	list	the QTN distribution containing "norm", "geom", "gamma" or "beta".
qtn.var	list(tr1 = 0.01)	list	the variances for normal distribution.
qtn.prob	list(tr1 = 0.5)	list	the probability of success for geometric distribution.
qtn.shape	list(tr1 = 1)	list	the shape parameter for gamma distribution.
qtn.scale	list(tr1 = 1)	list	the scale parameter for gamma distribution.
qtn.shape1	list(tr1 = 1)	list	the shape1 parameter for beta distribution.
qtn.shape2	list(tr1 = 1)	list	the shape2 parameter for beta distribution.
qtn.ncp	list(tr1 = 0)	list	the ncp parameter for beta distribution.
qtn.spot	FALSE	TRUE or FALSE	the QTN distribution probability in each block.
len.block	5e7	num	the block length.
maf	NULL	num	the maf threshold, markers less than this threshold will be excluded.

Generate phenotype using an external or species-specific or random genotype matrix

Users can use real genotype data with specific genetic structure to generate phenotype.

# Create a genotype matrix
# pop.geno <- read.table("genotype.txt")
# pop.geno <- bigmemory::attach.big.matrix("genotype.geno.desc")
pop.geno <- matrix(c(0, 1, 2, 0), nrow = 1e2, ncol = 1e4, byrow = TRUE)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.geno = pop.geno)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Users can also generate phenotype using a species-specific genotype matrix.

# Generate annotation simulation parameters
SP <- param.annot(species = "pig")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Users can also generate phenotype using a random genotype matrix.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate continuous phenotype

SIMER generates continuous phenotypes by default. If users want to output files, please see File output.
Continuous phenotype simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(tr1 = "continuous"),
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
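
The role of phe.h2A and phe.var in a "T1 = A + E" model can be illustrated with a small base R sketch: the additive part receives a share h2A of the phenotypic variance and the residual receives the rest. This is an illustration of the variance partition only; the variable names below are assumptions, and the scaling steps are not claimed to be how SIMER computes phenotypes.

```r
set.seed(123)
n       <- 1000
h2A     <- 0.3   # additive heritability, as in phe.h2A = list(tr1 = 0.3)
phe.var <- 100   # phenotypic variance, as in the commented phe.var line

a  <- rnorm(n)                    # stand-in for additive individual effects
e0 <- rnorm(n)
e  <- residuals(lm(e0 ~ a))       # residual made uncorrelated with a in this sample

# Rescale both parts so that their variances are h2A and (1 - h2A) of phe.var
a <- as.numeric(scale(a)) * sqrt(h2A * phe.var)
e <- as.numeric(scale(e)) * sqrt((1 - h2A) * phe.var)

y <- a + e                        # phenotype T1 = A + E
h2.realized <- var(a) / var(y)    # recovers h2A
```

Because the two parts are uncorrelated in the sample, var(y) equals phe.var and the realized heritability equals h2A up to floating-point error.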

If users want to output files, please see File output.
Multiple-trait simulation of continuous phenotype is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(tr1 = "continuous", tr2 = "continuous"),
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate case-control phenotype

SIMER generates case-control phenotypes when phe.type lists the category names and their proportions. If users want to output files, please see File output.
Case-control phenotype simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(tr1 = list(case = 0.01, control = 0.99)), # "T1" (Trait 1) consists of 1% case and 99% control
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
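
One common way to obtain a 1% case and 99% control split is to threshold a continuous value at the matching quantile. The sketch below shows that idea in base R; it is an assumption for illustration (whether SIMER uses exactly this liability-threshold rule is not stated here), and liability is a hypothetical stand-in for a simulated "A + E" value.

```r
set.seed(7)
liability  <- rnorm(1e4)   # hypothetical continuous value per individual
prevalence <- 0.01         # proportion of cases, as in case = 0.01

# Individuals above the (1 - prevalence) quantile are labelled as cases
threshold <- quantile(liability, probs = 1 - prevalence, names = FALSE)
status    <- ifelse(liability > threshold, "case", "control")
n.case    <- sum(status == "case")
```

Note that with pop.ind = 1e2 as in the example above, a 1% prevalence corresponds to a single case, so a larger population is needed for a usable number of cases.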

If users want to output files, please see File output.
Multiple-trait simulation of case-control phenotype is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(
    tr1 = list(case = 0.01, control = 0.99), # "T1" (Trait 1) consists of 1% case and 99% control
    tr2 = list(case = 0.01, control = 0.99)  # "T2" (Trait 2) consists of 1% case and 99% control
   ),
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate categorical phenotype

SIMER generates categorical phenotypes when phe.type lists the category names and their proportions. If users want to output files, please see File output.
Categorical phenotype simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(tr1 = list(low = 0.3, medium = 0.4, high = 0.3)), # "T1" (Trait 1) consists of 30% low, 40% medium, and 30% high
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
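
The 30% / 40% / 30% split can be pictured as cutting a continuous value at its 30% and 70% quantiles. The base R sketch below illustrates this under the assumption that categories are ordered from low to high along the continuous scale; the variable names are hypothetical and this is not claimed to be SIMER's internal procedure.

```r
set.seed(7)
value <- rnorm(1000)                              # hypothetical continuous value
props <- c(low = 0.3, medium = 0.4, high = 0.3)   # as in phe.type above

# Cut points at the cumulative proportions 0, 0.3, 0.7, and 1
breaks   <- quantile(value, probs = c(0, 0.3, 0.7, 1), names = FALSE)
category <- cut(value, breaks = breaks, labels = names(props), include.lowest = TRUE)
counts   <- table(category)
```

With 1000 values the three classes contain 300, 400, and 300 individuals respectively.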

If users want to output files, please see File output.
Multiple-trait simulation of categorical phenotype is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.type = list(
    tr1 = list(low = 0.3, medium = 0.4, high = 0.3), # "T1" (Trait 1) consists of 30% low, 40% medium, and 30% high
    tr2 = list(low = 0.3, medium = 0.4, high = 0.3)  # "T2" (Trait 2) consists of 30% low, 40% medium, and 30% high
   ),
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype using A model

In an “A” model, SIMER only considers an Additive effect as a genetic effect. Users should prepare Additive QTN effect in the Annotation data to generate an Additive Individual effect. If users want to output files, please see File output.
An Additive single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

In the multiple-trait simulation, SIMER builds accurate Additive genetic correlation among multiple traits. If users want to output files, please see File output.
An Additive multiple-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
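
A target correlation matrix such as phe.corA can be imposed on two sets of effects with a Cholesky factor. The sketch below first whitens the random draws so that the sample correlation matches the target exactly; it is one possible way to reach an exact correlation, shown as an assumption, and is not taken from SIMER's source.

```r
set.seed(42)
n.qtn <- 10
corA  <- matrix(c(1, 0.5, 0.5, 1), 2, 2)   # target additive genetic correlation

# Draw two columns, centre them, and whiten so their sample covariance is identity
z <- scale(matrix(rnorm(n.qtn * 2), ncol = 2), scale = FALSE)
z <- z %*% solve(chol(cov(z)))

# chol() returns the upper factor R with t(R) %*% R = corA,
# so z %*% R has sample covariance corA
eff      <- z %*% chol(corA)
realized <- cor(eff)
```

Without the whitening step the realized correlation would only be close to 0.5 on average, which matters when the number of QTNs is as small as 10.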

Generate phenotype using AD model

In an “AD” model, SIMER considers Additive effect and Dominant effect as genetic effects. Users should prepare Additive QTN effect and Dominant QTN effect in the Annotation data to generate an Additive Individual effect and a Dominant Individual effect. If users want to output files, please see File output.
Additive and Dominant single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A + D") # Additive effect and Dominant effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + D + E"), # "T1" (Trait 1) consists of Additive effect, Dominant effect, and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3),
  phe.h2D = list(tr1 = 0.1)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
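
The difference between the additive and dominant parts can be seen in how a genotype is coded. A common textbook coding, assumed here for illustration and not confirmed as SIMER's exact coding, uses the allele count (0, 1, 2) for the additive part and a heterozygote indicator (0, 1, 0) for the dominant part. The effect sizes a.eff and d.eff are hypothetical.

```r
geno <- c(0, 1, 2, 1, 0, 2)            # Numeric genotypes at one QTN for six individuals

add.code <- geno                       # additive coding: allele count
dom.code <- as.numeric(geno == 1)      # dominance coding: 1 for heterozygotes only

a.eff <- 0.5                           # hypothetical additive QTN effect
d.eff <- 0.2                           # hypothetical dominant QTN effect

# Genetic value contributed by this QTN under an "A + D" model
g <- add.code * a.eff + dom.code * d.eff
```

Heterozygotes receive a.eff + d.eff = 0.7 here, which is above the midpoint of the two homozygotes (0 and 1); that deviation is what the dominant effect represents.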

In the multiple-trait simulation, SIMER builds accurate Additive genetic correlation and accurate Dominant genetic correlation among multiple traits. If users want to output files, please see File output.
An Additive and Dominant multiple-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A + D") # Additive effect and Dominant effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + D + E", # "T1" (Trait 1) consists of Additive effect, Dominant effect, and Residual effect
    tr2 = "T2 = A + D + E"  # "T2" (Trait 2) consists of Additive effect, Dominant effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.h2D = list(tr1 = 0.1, tr2 = 0.1),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2), # Additive genetic correlation
  phe.corD = matrix(c(1, 0.5, 0.5, 1), 2, 2)  # Dominant genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype using GxG model

In a “GxG” model, SIMER considers Genetic-Genetic effect as a genetic effect. Users should prepare Genetic-Genetic QTN effect in the Annotation data to generate Genetic-Genetic Individual effect. If users want to output files, please see File output.
An example of Additive-Dominant interaction in single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A + D + A:D") # Additive effect, Dominant effect, and Additive-Dominant interaction effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + D + A:D + E"), # "T1" (Trait 1) consists of Additive effect, Dominant effect, Additive-Dominant interaction effect, and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3),
  phe.h2D = list(tr1 = 0.1),
  phe.h2GxG = list(tr1 = list("A:D" = 0.1))
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
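
With several heritabilities set at once, it is worth checking how the phenotypic variance is split. The arithmetic below assumes that each heritability is a share of the total phenotypic variance and that the residual takes the remainder; phe.var = 100 is taken from the commented line above, and the vector names are illustrative.

```r
phe.var <- 100
h2 <- c(A = 0.3, D = 0.1, "A:D" = 0.1)   # phe.h2A, phe.h2D, phe.h2GxG from the example

# Variance assigned to each component, with the residual E taking what is left
comp.var <- c(h2, E = 1 - sum(h2)) * phe.var
```

Under this assumption the heritabilities must sum to less than 1, otherwise nothing is left for the residual effect.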

In the multiple-trait simulation, SIMER builds accurate Genetic-Genetic interaction correlation among multiple traits. If users want to output files, please see File output.
An example of Additive-Dominant interaction in multiple-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A + D + A:D") # Additive effect, Dominant effect, and Additive-Dominant interaction effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + D + A:D + E", # "T1" (Trait 1) consists of Additive effect, Dominant effect, Additive-Dominant interaction effect, and Residual effect
    tr2 = "T2 = A + D + A:D + E"  # "T2" (Trait 2) consists of Additive effect, Dominant effect, Additive-Dominant interaction effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.h2D = list(tr1 = 0.1, tr2 = 0.1),
  phe.h2GxG = list(tr1 = list("A:D" = 0.1), tr2 = list("A:D" = 0.1)),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2),                 # Additive genetic correlation
  phe.corD = matrix(c(1, 0.5, 0.5, 1), 2, 2),                 # Dominant genetic correlation
  phe.corGxG = list("A:D" = matrix(c(1, 0.5, 0.5, 1), 2, 2))  # Additive-Dominant interaction genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype using Repeated Record model

In the Repeated Record model, SIMER adds a PE (Permanent Environmental) effect to the phenotype. The number of repeated records is set by pop.rep, and pop.rep.bal determines whether repeated records are balanced. If users want to output files, please see File output.
The Repeated Record in a single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.rep = 2,                          # The number of repeated records is 2
  pop.rep.bal = TRUE,                   # Repeated records are balanced
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
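
The structure of balanced repeated records can be sketched in base R: each individual keeps the same additive and permanent environmental values across records, while the residual is drawn anew for every record. The variance shares (0.3, 0.1, 0.6) and all variable names are assumptions for illustration; the PE share of 0.1 mirrors the phe.h2PE default in the parameter table.

```r
set.seed(11)
n.ind   <- 100
pop.rep <- 2                              # two records per individual (balanced)

a  <- rnorm(n.ind, sd = sqrt(0.3))        # additive value, one per individual
pe <- rnorm(n.ind, sd = sqrt(0.1))        # permanent environmental value, one per individual

rec <- data.frame(
  id     = rep(seq_len(n.ind), each = pop.rep),
  record = rep(seq_len(pop.rep), times = n.ind)
)
# a and pe repeat across an individual's records; the residual does not
rec$y <- a[rec$id] + pe[rec$id] + rnorm(nrow(rec), sd = sqrt(0.6))

# Expected repeatability under these assumed variance shares
repeatability <- (0.3 + 0.1) / (0.3 + 0.1 + 0.6)
```

The shared a and pe terms are what make two records of the same individual more alike than records of different individuals.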

In the multiple-trait simulation, SIMER builds accurate Permanent Environmental correlation among multiple traits. If users want to output files, please see File output.
Repeated Record in multiple-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.rep = 2,          # The number of repeated records is 2
  pop.rep.bal = TRUE,   # Repeated records are balanced
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2), # Additive genetic correlation
  phe.corPE = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Permanent Environmental correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype controlled by QTNs subject to Normal distribution

Normal distribution is the most common QTN effect distribution. If users want to output files, please see File output.
Phenotype controlled by QTNs subject to Normal distribution in single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "norm"),
  qtn.var = list(tr1 = 0.01)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
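
Since qtn.var is a variance, the matching standard deviation is its square root. The short base R sketch below draws ten effects with that spread; the zero mean is an assumption made for illustration, and the draw is not claimed to reproduce the values SIMER generates.

```r
set.seed(2024)
qtn.num <- 10
qtn.var <- 0.01

# rnorm() takes a standard deviation, so the variance 0.01 becomes sd = 0.1
qtn.sd  <- sqrt(qtn.var)
qtn.eff <- rnorm(qtn.num, mean = 0, sd = qtn.sd)
```

Passing the variance directly as sd would make the effects ten times too small, a common slip when moving between the two parameterisations.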

If users want to output files, please see File output.
Phenotype controlled by QTNs subject to Normal distribution in multiple-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10, tr2 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "norm", tr2 = "norm"),
  qtn.var = list(tr1 = 0.01, tr2 = 0.01)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype controlled by QTNs subject to Geometric distribution

The Geometric distribution describes the number of Bernoulli trials needed to obtain the first success, given the success probability of each trial (qtn.prob). It can be used as a QTN effect distribution. If users want to output files, please see File output.
Phenotype controlled by QTNs subject to Geometric distribution in single-trait simulation is displayed as follows:

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "geom"),
  qtn.prob = list(tr1 = 0.5)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Phenotype controlled by QTNs subject to Geometric distribution in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10, tr2 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "geom", tr2 = "geom"),
  qtn.prob = list(tr1 = 0.5, tr2 = 0.5)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
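
To get a feel for what a Geometric QTN effect distribution with probability 0.5 looks like, the draw can be sketched with base R alone. The variable names are hypothetical, and whether SIMER rescales the drawn effects afterwards is not covered here. Note that R's rgeom and dgeom count the number of failures before the first success, so the support starts at 0.

```r
set.seed(1)
qtn_prob <- 0.5

# Draw 10 hypothetical QTN effects from a Geometric distribution
qtn_eff <- rgeom(10, prob = qtn_prob)

# Probability of 0, 1, 2, 3 failures before the first success
p_first4 <- dgeom(0:3, prob = qtn_prob)
```

With a success probability of 0.5, each additional failure halves the probability, so most effects are small and a few are large.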

Generate phenotype controlled by QTNs subject to Gamma distribution

The Gamma distribution with an integer shape N is the distribution of the sum of N independent exponential random variables. Note that the Exponential distribution is a special case of the Gamma distribution with qtn.shape = 1; setting qtn.scale = 1 as well gives the Exponential distribution with rate 1. Phenotype controlled by QTNs subject to Gamma distribution in single-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "gamma"),
  qtn.shape = list(tr1 = 1),
  qtn.scale = list(tr1 = 1)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Phenotype controlled by QTNs subject to Gamma distribution in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10, tr2 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "gamma", tr2 = "gamma"),
  qtn.shape = list(tr1 = 1, tr2 = 1),
  qtn.scale = list(tr1 = 1, tr2 = 1)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
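
The link between the Gamma and Exponential distributions mentioned in this section can be checked with base R. This is a small sketch independent of SIMER; the variable names are hypothetical.

```r
# Density of Gamma(shape = 1, scale = 2) at a few points
x <- c(0.5, 1, 2, 4)
gamma_dens <- dgamma(x, shape = 1, scale = 2)
# Density of the Exponential distribution with rate = 1 / scale
exp_dens <- dexp(x, rate = 1 / 2)
# The two densities agree when shape = 1
same_density <- isTRUE(all.equal(gamma_dens, exp_dens))

# Draw 10 hypothetical QTN effects with shape = 1 and scale = 1
set.seed(1)
qtn_eff <- rgamma(10, shape = 1, scale = 1)
```

Increasing the shape above 1 moves the mode away from zero, which gives fewer near-zero effects.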

Generate phenotype controlled by QTNs subject to Beta distribution

The Beta distribution is the conjugate prior distribution of the Bernoulli and Binomial distributions. Phenotype controlled by QTNs subject to Beta distribution in single-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "beta"),
  qtn.shape1 = list(tr1 = 1),
  qtn.shape2 = list(tr1 = 1),
  qtn.ncp = list(tr1 = 0)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Phenotype controlled by QTNs subject to Beta distribution in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4,
  qtn.num = list(tr1 = 10, tr2 = 10),
  qtn.model = "A",
  qtn.dist = list(tr1 = "beta", tr2 = "beta"),
  qtn.shape1 = list(tr1 = 1, tr2 = 1),
  qtn.shape2 = list(tr1 = 1, tr2 = 1),
  qtn.ncp = list(tr1 = 0, tr2 = 0)
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
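
Two properties of the Beta distribution used in this section can be illustrated with base R. This is a sketch, not SIMER code; the prior parameters and the observed counts are hypothetical.

```r
# With shape1 = 1 and shape2 = 1 the Beta density is flat on (0, 1)
x <- c(0.2, 0.5, 0.9)
beta_dens <- dbeta(x, shape1 = 1, shape2 = 1)

# Conjugacy: a Beta(a, b) prior combined with k successes in n
# Binomial trials gives a Beta(a + k, b + n - k) posterior
a <- 1; b <- 1   # hypothetical prior
k <- 7; n <- 10  # hypothetical data
post_a <- a + k
post_b <- b + n - k
post_mean <- post_a / (post_a + post_b)

# Draw 10 hypothetical QTN effects (ncp = 0 is the central Beta)
set.seed(1)
qtn_eff <- rbeta(10, shape1 = 1, shape2 = 1, ncp = 0)
```

All drawn effects lie between 0 and 1, so this distribution only yields positive, bounded QTN effects before any rescaling.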

Generate phenotype with fixed effect and covariate and environmental random effect

SIMER supports adding Fixed effects, Covariates, and Environmental Random effects to a phenotype. Users should prepare a list of environmental factor settings. Fixed effects, Covariates, and Environmental Random effects are determined by effect, slope, and ratio, respectively. A phenotype with Fixed effect, Covariate, and Environmental Random effect in single-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP, 
  pop.env = pop.env,
  phe.model = list(tr1 = "T1 = A + F1 + F2 + C1 + R1 + E"), # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

A phenotype with Fixed effect, Covariate, and Environmental Random effect in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30), tr2 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30), tr2 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5, tr2 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1, tr2 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A")
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP, 
  pop.env = pop.env,
  phe.model = list(
    tr1 = "T1 = A + F1 + F2 + C1 + R1 + E", # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, and Residual effect
    tr2 = "T2 = A + F1 + F2 + C1 + R1 + E"  # "T2" (Trait 2) consists of Additive effect, F1, F2, C1, R1, and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
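
How the environmental terms F1, C1, and R1 could enter a phenotype can be sketched in base R. This is an illustration only: levels are assigned to individuals at random, and the random effect R1 is drawn with an arbitrary variance, which is an assumption. SIMER's handling of the ratio setting is not reproduced here.

```r
set.seed(1)
n_ind <- 100

# Fixed effect F1: each level has a set effect
f1_effect <- c("1" = 50, "2" = 30)
f1_level <- sample(names(f1_effect), n_ind, replace = TRUE)
f1_part <- unname(f1_effect[f1_level])

# Covariate C1: the contribution is slope * covariate value
c1_slope <- 1.5
c1_value <- sample(c(70, 80, 90), n_ind, replace = TRUE)
c1_part <- c1_slope * c1_value

# Random effect R1: one random draw per level (variance is hypothetical)
r1_names <- c("l1", "l2", "l3")
r1_draw <- setNames(rnorm(length(r1_names), mean = 0, sd = 1), r1_names)
r1_level <- sample(r1_names, n_ind, replace = TRUE)
r1_part <- unname(r1_draw[r1_level])

# Environmental part of the phenotype
env_part <- f1_part + c1_part + r1_part
```

The fixed effect shifts each individual by the effect of its level, while the covariate contributes in proportion to its numeric value.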

Generate phenotype using GxE model

In a “GxE” model, SIMER adds a Genetic-Environmental interaction effect to the phenotype. Users should prepare the Genetic QTN effect in the Annotation data and environmental factor by pop.env to generate a Genetic-Environmental Individual effect. An example of a Genetic-Environmental interaction in a single-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.env = pop.env,
  phe.model = list(
    tr1 = "T1 = A + F1 + F2 + C1 + R1 + A:F1 + E" # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, Additive-F1 interaction effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3),
  phe.h2GxE = list(tr1 = list("A:F1" = 0.1))
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

An example of Genetic-Environmental interaction in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30), tr2 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30), tr2 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5, tr2 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1, tr2 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.env = pop.env,
  phe.model = list(
    tr1 = "T1 = A + F1 + F2 + C1 + R1 + A:F1 + E", # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, Additive-F1 interaction effect, and Residual effect
    tr2 = "T2 = A + F1 + F2 + C1 + R1 + A:F1 + E"  # "T2" (Trait 2) consists of Additive effect, F1, F2, C1, R1, Additive-F1 interaction effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.h2GxE = list(tr1 = list("A:F1" = 0.1), tr2 = list("A:F1" = 0.1)),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Generate phenotype using ExE model

In an “ExE” model, SIMER adds an Environmental-Environmental interaction effect to the phenotype. Users should prepare the environmental factors by pop.env to generate an Environmental-Environmental Individual effect. An example of Environmental-Environmental interaction in single-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.env = pop.env,
  phe.model = list(
    tr1 = "T1 = A + F1 + F2 + C1 + R1 + F1:R1 + E" # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, F1-R1 interaction effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3),
  phe.h2GxE = list(tr1 = list("F1:R1" = 0.1))
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

An example of Environmental-Environmental interaction in multiple-trait simulation is displayed as follows:
If users want to output files, please see File output.

# Prepare environmental factor list
pop.env <- list(
  F1 = list( # fixed effect 1
    level = c("1", "2"),
    effect = list(tr1 = c(50, 30), tr2 = c(50, 30))
  ), 
  F2 = list( # fixed effect 2
    level = c("d1", "d2", "d3"),
    effect = list(tr1 = c(10, 20, 30), tr2 = c(10, 20, 30))
  ),
  C1 = list( # covariate 1
    level = c(70, 80, 90),
    slope = list(tr1 = 1.5, tr2 = 1.5)
  ),
  R1 = list( # random effect 1
    level = c("l1", "l2", "l3"),
    ratio = list(tr1 = 0.1, tr2 = 0.1)
  )
)

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10), qtn.model = "A") # Additive effect
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  pop.env = pop.env,
  phe.model = list(
    tr1 = "T1 = A + F1 + F2 + C1 + R1 + F1:R1 + E", # "T1" (Trait 1) consists of Additive effect, F1, F2, C1, R1, F1:R1 interaction effect, and Residual effect
    tr2 = "T2 = A + F1 + F2 + C1 + R1 + F1:R1 + E"  # "T2" (Trait 2) consists of Additive effect, F1, F2, C1, R1, F1:R1 interaction effect, and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.h2GxE = list(tr1 = list("F1:R1" = 0.1), tr2 = list("F1:R1" = 0.1)),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
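
One common way to represent an interaction such as F1:R1 between two categorical factors is a combined factor with one level per combination. The base R sketch below illustrates that idea. It is an assumption for illustration, not a description of how SIMER builds the interaction internally, and the effect variance is hypothetical.

```r
set.seed(1)
n_ind <- 100

# Random assignment of individuals to the levels of F1 and R1
f1 <- factor(sample(c("1", "2"), n_ind, replace = TRUE),
             levels = c("1", "2"))
r1 <- factor(sample(c("l1", "l2", "l3"), n_ind, replace = TRUE),
             levels = c("l1", "l2", "l3"))

# Combined factor: one level per F1 x R1 combination
f1_r1 <- interaction(f1, r1, sep = ":")
n_comb <- nlevels(f1_r1)

# One hypothetical effect per combination, mapped to individuals
comb_eff <- setNames(rnorm(n_comb), levels(f1_r1))
exe_part <- unname(comb_eff[as.character(f1_r1)])
```

With 2 levels of F1 and 3 levels of R1, the combined factor has 6 levels.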

Generate phenotype controlled by varied QTN effect distribution

In a single-trait simulation, the trait can be controlled by several groups of QTNs, each with its own effect distribution. An example of a single trait controlled by two groups of QTNs is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4, 
  qtn.num = list(tr1 = c(2, 8)), # Group1: 2 QTNs; Group 2: 8 QTNs
  qtn.dist = list(tr1 = c("norm", "norm")),
  qtn.var = list(tr1 = c(0.01, 0.01)), # Group1: genetic variance of QTNs = 0.01; Group2: genetic variance of QTNs = 0.01
  qtn.model = "A"
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(tr1 = "T1 = A + E"), # "T1" (Trait 1) consists of Additive effect and Residual effect
  # phe.var = list(tr1 = 100),
  phe.h2A = list(tr1 = 0.3)
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

An example of multiple traits, each controlled by two groups of QTNs, is displayed as follows:
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(
  pop.marker = 1e4, 
  qtn.num = list(tr1 = c(2, 8), tr2 = c(2, 8)), # Group1: 2 QTNs; Group 2: 8 QTNs
  qtn.dist = list(tr1 = c("norm", "norm"), tr2 = c("norm", "norm")),
  qtn.var = list(tr1 = c(0.01, 0.01), tr2 = c(0.01, 0.01)), # Group1: genetic variance of QTNs = 0.01; Group2: genetic variance of QTNs = 0.01
  qtn.model = "A"
)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2) # random genotype
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP,
  phe.model = list(
    tr1 = "T1 = A + E", # "T1" (Trait 1) consists of Additive effect and Residual effect
    tr2 = "T2 = A + E"  # "T2" (Trait 2) consists of Additive effect and Residual effect
  ),
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.h2A = list(tr1 = 0.3, tr2 = 0.3),
  phe.corA = matrix(c(1, 0.5, 0.5, 1), 2, 2) # Additive genetic correlation
)

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)

Population Simulation of Multiple-Generation with Genotype and Phenotype

SIMER imitates the reproductive process of organisms to generate a Multiple-Generation population. The genotype data and phenotype data of the population are screened by single-trait selection or multiple-trait selection, and then those data are amplified by species-specific reproduction.

Gallery of population simulation parameters

Individual selection for a single trait

Parameter	Default	Options	Description
pop.sel	NULL	list	the selected males and females.
ps	c(0.8, 0.8)	num vector	if ps <= 1, fraction selected in selection of males and females; if ps > 1, ps is number of selected males and females.
decr	TRUE	TRUE or FALSE	whether the sort order is decreasing.
sel.crit	“pheno”	character	the selection criteria, it can be “TBV”, “TGV”, and “pheno”.
sel.single	“ind”	character	the single-trait selection method, it can be “ind”, “fam”, “infam”, and “comb”.
sel.multi	“index”	character	the multiple-trait selection method, it can be “index”, “indcul”, and “tdm”.
index.wt	c(0.5, 0.5)	num vector	the weight of each trait for multiple-trait selection.
index.tdm	1	num	the index of tandem selection for multiple-trait selection.
goal.perc	0.1	num	the percentage of goal more than the mean of scores of individuals.
pass.perc	0.9	num	the percentage of expected excellent individuals.

Parameter	Default	Options	Description
pop.gen	1	num	the generations of simulated population.
reprod.way	“randmate”	character	reproduction method, it consists of “clone”, “dh”, “selfpol”, “randmate”, “randexself”, “assort”, “disassort”, “2waycro”, “3waycro”, “4waycro”, “backcro”, and “userped”.
sex.rate	0.5	num	the male rate in the population.
prog	2	num	the progeny number of an individual.

Individual selection is a selection method based on the phenotype of each individual, which is also known as mass selection. This selection method is simple and works well for traits with high heritability.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
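
The idea behind individual selection with ps = c(0.8, 0.8) can be sketched in base R as truncation selection within each sex. This is only an illustration: the phenotypes are random numbers, the helper select_top is hypothetical, and coding males as 1 and females as 2 is an assumption.

```r
set.seed(1)
n_ind <- 100
pheno <- rnorm(n_ind, mean = 100, sd = 10)  # hypothetical phenotypes
sex <- rep(c(1, 2), each = n_ind / 2)       # assumed coding: 1 = male, 2 = female
ps <- c(0.8, 0.8)                           # selected fraction of males and females

# Hypothetical helper: keep the best fraction of candidates by score
select_top <- function(idx, score, frac, decreasing = TRUE) {
  n_sel <- round(length(idx) * frac)
  idx[order(score[idx], decreasing = decreasing)][seq_len(n_sel)]
}

sel_males <- select_top(which(sex == 1), pheno, ps[1])
sel_females <- select_top(which(sex == 2), pheno, ps[2])
```

Setting decreasing = FALSE in the helper would select the individuals with the lowest phenotypes instead, which mirrors the role of decr.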

Family selection for a single trait

Family selection selects whole families based on the family mean. This selection method is used for traits with low heritability.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "fam")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
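
Family selection can be sketched in base R as ranking families by their mean phenotype and keeping every member of the best families. The family structure, the phenotypes, and the number of families kept are hypothetical.

```r
set.seed(1)
fam <- rep(1:10, each = 5)                # 10 hypothetical families of 5
pheno <- rnorm(length(fam), mean = 100, sd = 10)

# Mean phenotype of each family
fam_mean <- tapply(pheno, fam, mean)

# Keep all members of the 3 families with the highest mean
n_keep <- 3
best_fam <- as.integer(names(sort(fam_mean, decreasing = TRUE))[seq_len(n_keep)])
selected <- which(fam %in% best_fam)
```

Individuals are never compared directly here; a poor individual from a good family is kept, and a good individual from a poor family is dropped.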

Within-family selection for a single trait

Within-family selection is a selection method based on the deviation of each individual's phenotype from the mean of its own family. This selection method is used for traits with low heritability and small families.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "infam")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
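
The within-family deviation described here can be computed in base R as the phenotype minus the family mean. The sketch below keeps the best individual of each family. The data and the choice of one individual per family are hypothetical.

```r
set.seed(1)
fam <- rep(1:10, each = 5)                # 10 hypothetical families of 5
pheno <- rnorm(length(fam), mean = 100, sd = 10)

# Deviation of each individual from its own family mean
dev <- pheno - ave(pheno, fam)

# Index of the individual with the largest deviation in each family
best_in_fam <- as.vector(
  tapply(seq_along(pheno), fam, function(i) i[which.max(dev[i])])
)
```

Because every family contributes the same number of individuals, differences between family means do not influence the result.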

Combined selection for a single trait

Combined selection is a selection method based on a weighted combination of the family mean and the deviation of individual phenotype from the family mean.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "comb")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)

Tandem selection for multiple traits

Tandem selection selects several target traits sequentially, one trait at a time. The index of the trait currently under selection is index.tdm, and users should not set this parameter themselves.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10))
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP, 
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.model = list(
    tr1 = "T1 = A + E",
    tr2 = "T2 = A + E"
  )
)
# Generate selection parameters
SP <- param.sel(SP = SP, sel.multi = "tdm")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)

Independent culling selection for multiple traits

Independent culling selection sets a minimum selection criterion for each target trait. A candidate is eliminated when its performance on any trait is lower than the corresponding criterion.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10))
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP, 
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.model = list(
    tr1 = "T1 = A + E",
    tr2 = "T2 = A + E"
  )
)
# Generate selection parameters
SP <- param.sel(SP = SP, sel.multi = "indcul")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
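
The culling rule can be sketched in base R: an individual is kept only if it meets the minimum criterion on every trait. The phenotypes and the thresholds in min_crit are hypothetical, and how SIMER derives its criteria is not shown here.

```r
set.seed(1)
pheno <- cbind(
  tr1 = rnorm(100, mean = 100, sd = 10),
  tr2 = rnorm(100, mean = 50, sd = 5)
)
min_crit <- c(tr1 = 95, tr2 = 48)  # hypothetical minimum criteria

# TRUE where an individual meets the criterion for a trait
pass <- sweep(pheno, 2, min_crit, FUN = ">=")
# Keep individuals that pass on every trait
keep <- which(rowSums(pass) == ncol(pheno))
```

An individual that is excellent on one trait is still removed if it falls below the criterion on the other.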

Index selection for multiple traits

Index selection is a comprehensive selection that considers several traits based on their respective heritabilities, phenotypic variances, economic weights, corresponding genetic correlations, and phenotypes. SIMER then calculates an index value for each individual and selects or eliminates the individual according to that value. Users can set the weight of each trait with index.wt.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4, qtn.num = list(tr1 = 10, tr2 = 10))
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(
  SP = SP, 
  # phe.var = list(tr1 = 100, tr2 = 100),
  phe.model = list(
    tr1 = "T1 = A + E",
    tr2 = "T2 = A + E"
  )
)
# Generate selection parameters
SP <- param.sel(SP = SP, sel.multi = "index")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
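
A much simplified selection index can be sketched in base R as a weighted sum of standardized trait values. This sketch deliberately ignores heritabilities and genetic correlations, so it is not the index SIMER computes. The weights correspond to the default of index.wt, and the selected fraction is hypothetical.

```r
set.seed(1)
pheno <- cbind(
  tr1 = rnorm(100, mean = 100, sd = 10),
  tr2 = rnorm(100, mean = 50, sd = 5)
)
index_wt <- c(0.5, 0.5)  # weight of each trait

# Standardize the traits so that their scales are comparable,
# then combine them into one index value per individual
index <- as.vector(scale(pheno) %*% index_wt)

# Keep the 80% of individuals with the highest index (hypothetical fraction)
n_sel <- round(0.8 * nrow(pheno))
selected <- order(index, decreasing = TRUE)[seq_len(n_sel)]
```

Unlike independent culling, a weakness on one trait can be compensated by strength on another.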

Clone for plants

Clone is an asexual reproduction method that does not involve germ cells or fertilization; a new individual is formed directly from a part of the mother. Sex of offspring will be 0 in clone.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "clone")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Doubled haploid for plants

Doubled haploid is a reproduction method in which breeders first obtain haploid plants and then induce chromosome doubling to restore the normal chromosome number, which yields fully homozygous individuals. Sex of offspring will be 0 in dh.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "dh")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)
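
The genetic consequence of doubling a haploid can be sketched in base R: a gamete is sampled from a diploid parent and then duplicated, so every marker becomes homozygous. The sketch assumes that markers segregate independently (no linkage), which is a simplification, and all names are hypothetical.

```r
set.seed(1)
n_marker <- 20

# Two hypothetical parental haplotypes coded 0/1
hap1 <- rbinom(n_marker, size = 1, prob = 0.5)
hap2 <- rbinom(n_marker, size = 1, prob = 0.5)

# Gamete: at each marker, take the allele from hap1 or hap2 at random
pick <- rbinom(n_marker, size = 1, prob = 0.5) == 1
gamete <- ifelse(pick, hap1, hap2)

# Doubling the haploid gamete gives a genotype coded 0 or 2
dh_geno <- 2 * gamete
```

No heterozygous genotype (coded 1) can occur in the doubled haploid offspring.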

Self-pollination for plants and micro-organisms

Self-pollination refers to the combination of male and female gametes from the same individual or between individuals from the same clonal breeding line. Sex of offspring will be 0 in selfpol.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "selfpol")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Random mating for plants and animals

In random mating, every individual has the same probability of mating with any individual of the opposite sex. Sex of offspring is controlled by sex.rate in randmate.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "randmate")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)
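
Random mating can be sketched in base R as drawing a sire at random for every offspring. Several points are assumptions made for illustration: males are coded 1 and females 2, prog is applied per dam, and the offspring IDs and the pedigree column names are hypothetical.

```r
set.seed(1)
sex <- rep(c(1, 2), each = 50)  # assumed coding: 1 = male, 2 = female
males <- which(sex == 1)
females <- which(sex == 2)

prog <- 2        # progeny per dam (assumption)
sex_rate <- 0.5  # probability that an offspring is male

n_off <- length(females) * prog
dam <- rep(females, each = prog)
# Every male has the same chance of siring each offspring
sire <- sample(males, n_off, replace = TRUE)
# Offspring sex follows the male rate
off_sex <- ifelse(runif(n_off) < sex_rate, 1, 2)

ped <- data.frame(index = 100 + seq_len(n_off),
                  sire = sire, dam = dam, sex = off_sex)
```

The resulting table has the same three-column layout (sample, sire, dam) as a user designed pedigree, plus the offspring sex.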

Random mating excluding self-pollination for animals

In random mating excluding self-pollination, an individual cannot mate with itself. Sex of offspring is controlled by sex.rate in randexself.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "randexself")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Assortative mating for plants and animals

In assortative mating, mated pairs are of the same phenotype more often than would occur by chance. Sex of offspring is controlled by sex.rate in assort.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "assort")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Disassortative mating for plants and animals

In disassortative mating, mated pairs share the same phenotype less often than would occur by chance. The sex of offspring under disassort is controlled by sex.ratio.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "disassort")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)
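
The difference between assortative and disassortative mating can be seen in the phenotypic correlation between mates. The base R sketch below pairs parents strictly by phenotype rank, which is the extreme form of both schemes. It is not SIMER's internal algorithm, and sire.phe and dam.phe are hypothetical simulated values.

```r
# Illustrative sketch only (not SIMER internals): pair parents by phenotype rank.
set.seed(1)
sire.phe <- rnorm(50)   # hypothetical phenotypes of 50 sires
dam.phe  <- rnorm(50)   # hypothetical phenotypes of 50 dams

# Assortative: highest sire with highest dam, second highest with second highest, ...
cor.assort <- cor(sort(sire.phe), sort(dam.phe))
# Disassortative: highest sire with lowest dam, and so on
cor.disassort <- cor(sort(sire.phe), sort(dam.phe, decreasing = TRUE))
# Random pairing for comparison
cor.random <- cor(sire.phe, sample(dam.phe))

c(assort = cor.assort, disassort = cor.disassort, random = cor.random)
```

A positive correlation between mates indicates assortative pairing, a negative one indicates disassortative pairing, and random pairing is expected to give a value near zero.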

Two-way cross for animals

The two-way cross method uses sex to distinguish two different breeds: the first breed serves as the sire and the second breed as the dam.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "2waycro")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Split the population into two breeds by sex (first 50: sire breed, last 50: dam breed)
SP$pheno$pop$gen1$sex <- rep(c(1, 2), c(50, 50))
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Three-way cross for animals

The three-way cross method uses sex to distinguish three different breeds: the first breed is the sire and the second breed is the dam in the first two-way cross, and the third breed is the terminal sire.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "3waycro")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Split the population into three breeds by sex (30 sires, 30 dams, 40 terminal sires)
SP$pheno$pop$gen1$sex <- rep(c(1, 2, 1), c(30, 30, 40))
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

Four-way cross for animals

The four-way cross method uses sex to distinguish four different breeds: the first breed is the sire and the second breed is the dam in the first two-way cross, and the third breed is the sire and the fourth breed is the dam in the second two-way cross.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "4waycro")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Split the population into four breeds by sex (25 individuals each: sire, dam, sire, dam)
SP$pheno$pop$gen1$sex <- rep(c(1, 2, 1, 2), c(25, 25, 25, 25))
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)
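
The cross examples above all assign breeds through the sex column with rep(values, times). The base R sketch below shows what those vectors look like. The breed labels are hypothetical and only serve to check the layout.

```r
# Base R only: rep(x, times) repeats each element of x the matching number of times.
sex2 <- rep(c(1, 2), c(50, 50))                # two-way cross
sex3 <- rep(c(1, 2, 1), c(30, 30, 40))         # three-way cross
sex4 <- rep(c(1, 2, 1, 2), c(25, 25, 25, 25))  # four-way cross

# Hypothetical breed labels, built the same way, to check the layout
breed3 <- rep(c("breed1", "breed2", "breed3"), c(30, 30, 40))
table(breed3, sex3)
```

Because the breeds are encoded only by position and sex, the block sizes passed to rep must add up to pop.ind (100 in these examples).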

Back cross for animals

The back cross method uses sex to distinguish two different breeds: the first breed is always the sire in each generation, and the second breed is the dam in the first two-way cross.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate selection parameters
SP <- param.sel(SP = SP, sel.single = "ind")
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "backcro")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Split the population into two breeds by sex (first 50: sire breed, last 50: dam breed)
SP$pheno$pop$gen1$sex <- rep(c(1, 2), c(50, 50))
# Run selection
SP <- selects(SP)
# Run reproduction
SP <- reproduces(SP)

User-designed pedigree mating for plants and animals

User-designed pedigree mating needs a specific user-designed pedigree to control the mating process. The first column is sample id, the second column is paternal id, and the third column is maternal id. Please make sure that paternal id and maternal id can match the genotype data.
If users want to output files, please see File output.

# Generate annotation simulation parameters
SP <- param.annot(pop.marker = 1e4)
# Generate genotype simulation parameters
SP <- param.geno(SP = SP, pop.ind = 1e2)
# Generate phenotype simulation parameters
SP <- param.pheno(SP = SP, phe.h2A = list(tr1 = 0.3))
# Generate reproduction parameters
SP <- param.reprod(SP = SP, reprod.way = "userped")

# Run annotation simulation
SP <- annotation(SP)
# Run genotype simulation
SP <- genotype(SP)
# Run phenotype simulation
SP <- phenotype(SP)
# Run reproduction
SP <- reproduces(SP)
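
Before running a user-designed pedigree, it can help to confirm that every parent id is present in the genotype data. The following base R sketch shows one way to do that check. The pedigree values mirror the three-column layout described above, while geno.id is a hypothetical vector of genotyped sample ids.

```r
# Hypothetical pedigree in the three-column layout (sample id, paternal id, maternal id)
ped <- data.frame(
  Index = 41:46,
  Sire  = c(1, 1, 1, 1, 2, 2),
  Dam   = c(11, 11, 11, 11, 12, 12)
)
# Hypothetical ids of the individuals that have genotype data
geno.id <- 1:20

# Parents missing from the genotype data (empty when the pedigree is consistent)
missing.sire <- setdiff(ped$Sire, geno.id)
missing.dam  <- setdiff(ped$Dam, geno.id)
# Sample ids should be unique
dup.index <- ped$Index[duplicated(ped$Index)]

ped.ok <- length(missing.sire) == 0 &&
  length(missing.dam) == 0 &&
  length(dup.index) == 0
ped.ok
```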

AN EASY WAY TO GENERATE A POPULATION

The methods above generate a population step by step, which makes each stage easy to follow. SIMER can also generate a population directly, in a single and more convenient call.
If users want to output files, please see File output.

# Generate all simulation parameters
SP <- param.simer(qtn.num = list(tr1 = 10), pop.marker = 1e4, pop.ind = 1e2, sel.single = "ind", reprod.way = "randmate")

# Run Simer
SP <- simer(SP)

Breeding Program Design

After a population has been generated, breeders often want to evaluate a breeding program design. SIMER can evaluate the design by simulation, which saves both money and time.

Gallery of breeding program design parameters

Parameter	Default	Options	Description
jsonFile	NULL	character	the path of JSON file.
hiblupPath	“”	character	the path of HIBLUP software.
out	“simer.qc”	character	the prefix of output files.
dataQC	TRUE	TRUE or FALSE	whether to make data quality control.
buildModel	TRUE	TRUE or FALSE	whether to build EBV model.
buildIndex	TRUE	TRUE or FALSE	whether to build Selection Index.
ncpus	10	num	the number of threads used, if NULL, (logical core number - 1) is automatically used.
verbose	TRUE	TRUE or FALSE	whether to print detail.

Preparation of a breeding program design

A breeding program design should be stored in a JSON file, for example plan1.json, with the following fields:
genotype: the absolute path of the genotype data, or its path relative to the JSON file
pedigree: the filename of the pedigree data, given as an absolute path or as a path relative to the JSON file
selection_index: the economic weight of the phenotype for each trait
threads: the number of threads used in multi-threaded computation
genetic_progress: the genetic progress of a breeding plan
breeding_value_index: the economic weight of the breeding value for each trait
auto_optimization: whether to optimize the EBV estimation model and the selection index automatically
quality_control_plan: the quality control plan for genotype, pedigree, and phenotype

{
    "genotype": "../02plinkb",
    "pedigree": "../05others/pedigree.txt",
    "selection_index": "100 - 0.2 * T1 + 0.8 * T2",
    "threads": 16,
    "genetic_progress": [],
    "breeding_value_index": "-0.2 * T1 + 0.8 * T2",
    "auto_optimization": true,
    "quality_control_plan": {
        "genotype_quality_control":{
            "filter": "F1 == 'Male'",
            "filter_geno": 0.1,
            "filter_mind": 0.1,
            "filter_maf": 0.05,
            "filter_hwe": 0.001
        },
        "pedigree_quality_control":{
            "standard_ID": false,
            "candidate_sire_file": [],
            "candidate_dam_file": [],
            "exclude_threshold": 0.1, 
            "assign_threshold": 0.05
        },
        "phenotype_quality_control":[
            {
                "job_name": "Data_Quality_Control_Demo",
                "sample_info": "../05others/phenotype.txt",
                "repeated_records": false,
                "multi_trait": true,
                "filter": "F1 == 'Male'",
                "job_traits": [
                    {
                        "traits": "T1",
                        "definition": "T1",
                        "range": []
                    },
                    {
                        "traits": "T2",
                        "definition": "T2",
                        "range": []
                    }
                ]
            }
        ]
    },
    "breeding_plan":[
        {
            "job_name": "EBV_Model_Demo",
            "sample_info": "../05others/phenotype.txt",
            "repeated_records": false,
            "multi_trait": true,
            "vc_vars": [],
            "vc_covars": [],
            "random_ratio": 0.05,
            "job_traits": [
                {
                    "traits": "T1",
                    "covariates": [],
                    "fixed_effects": ["F1", "F2"],
                    "random_effects": ["R1"]
                },
                {
                    "traits": "T2",
                    "covariates": [],
                    "fixed_effects": ["F1", "F2"],
                    "random_effects": ["R1"]
                }
            ]
        }
    ]
}
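
The selection_index field in the JSON example is an arithmetic expression over trait names. To show what such an expression computes, the base R sketch below evaluates it on a small data frame. The trait values are hypothetical, and this is not how SIMER or HIBLUP parses the field internally; it only illustrates the arithmetic.

```r
# Hypothetical phenotype records for the two traits named in the JSON example
phe <- data.frame(T1 = c(10, 20), T2 = c(5, 0))

# Index expression copied from the "selection_index" field
selection.index <- "100 - 0.2 * T1 + 0.8 * T2"

# Evaluate the expression with the data frame columns as variables
phe$index <- eval(parse(text = selection.index), envir = phe)
phe
```

In this expression T1 carries a negative economic weight and T2 a positive one, so the first record (lower T1, higher T2) receives the higher index.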

Evaluation of a breeding program design

To evaluate the breeding program design, SIMER completes the following three tasks:
(1) Data quality control for genotype, pedigree, and phenotype
(2) Model optimization (i.e., the most suitable covariate, fixed effect, and random effect)
(3) Construction of Selection Index and calculation of Genetic Progress

# Get JSON file
jsonFile <- system.file("extdata", "04breeding_plan", "plan1.json", package = "simer")

# It needs "plink" and "hiblup" software
jsonList <- simer.Data.Json(jsonFile = jsonFile)

Global Options

Users can use global parameters to control the population properties, the number of threads used for simulation, and the output of simulation data.

Gallery of global parameters

Parameter	Default	Options	Description
replication	1	num	the replication times of simulation.
seed.sim	random	num	simulation random seed.
out	“simer”	character	the prefix of output files.
outpath	NULL	character	the path of output files, Simer writes files only if outpath is not “NULL”.
out.format	“numeric”	“numeric” or “plink”	the data format of output files.
pop.gen	1	num	the generations of simulated population.
out.geno.gen	1	num vector	the output generations of genotype data.
out.pheno.gen	1	num vector	the output generations of phenotype data.
useAllGeno	FALSE	TRUE or FALSE	whether to use all genotype data to simulate phenotype.
missing.geno	NULL	num	the ratio of missing values in genotype data.
missing.phe	NULL	list	the ratio of missing values in phenotype data.
ncpus	0	num	the number of threads used, if NULL, (logical core number - 1) is automatically used.
verbose	TRUE	TRUE or FALSE	whether to print detail.

Counts of total population size

Users can calculate the number of individuals per generation using IndPerGen directly.

pop <- generate.pop(pop.ind = 100)
count.ind <- IndPerGen(pop = pop, pop.gen = 2, ps = c(0.8, 0.8), reprod.way = "randmate", sex.rate = 0.5, prog = 2)

Multi-thread computation

SIMER runs on multiple threads. Users can set the number of threads used for simulation with ncpus:

# Generate all simulation parameters
SP <- param.simer(out = "simer", ncpus = 2)

# Run Simer
SP <- simer(SP)

Multi-population simulation

Multiple populations can be simulated with a for loop in R.

# Number of replications (named n.rep to avoid masking the base function rep)
n.rep <- 2

# Result list
SPs <- vector("list", n.rep)

for (i in seq_len(n.rep)) {
  # Generate all simulation parameters
  SP <- param.simer(replication = i, seed.sim = i, out = "simer")

  # Run Simer
  SPs[[i]] <- simer(SP)
}

File output

SIMER does not write files by default. A series of files prefixed with out is written only when outpath is specified.

### 01 Numeric Format ###
# Generate all simulation parameters
SP <- param.simer(
  # SP = SP, # uncomment it when users already have a "SP"
  out = "simer",
  outpath = getwd(),
  out.format = "numeric"
)

# Run Simer
SP <- simer(SP)

### 02 PLINK Binary Format ###
# Generate all simulation parameters
SP <- param.simer(
  # SP = SP, # uncomment it when users already have a "SP"
  out = "simer",
  outpath = getwd(),
  out.format = "plink"
)

# Run Simer
SP <- simer(SP)

### 03 Numeric Format with missing values in genotype and phenotype ###
# Generate all simulation parameters
SP <- param.simer(
  # SP = SP, # uncomment it when users already have a "SP"
  out = "simer",
  outpath = getwd(),
  out.format = "numeric",
  missing.geno = 0.01,
  missing.phe = list(tr1 = 0.5)
)

# Run Simer
SP <- simer(SP)

### 04 PLINK Binary Format with missing values in genotype and phenotype ###
# Generate all simulation parameters
SP <- param.simer(
  # SP = SP, # uncomment it when users already have a "SP"
  out = "simer",
  outpath = getwd(),
  out.format = "plink",
  missing.geno = 0.01,
  missing.phe = list(tr1 = 0.5)
)

# Run Simer
SP <- simer(SP)

Generation-selective output

Output of genotype and phenotype can be generation-selective using out.geno.gen and out.pheno.gen.

# Generate all simulation parameters
SP <- param.simer(out = "simer", outpath = getwd(), pop.gen = 2, out.geno.gen = 1:2, out.pheno.gen = 1:2)

# Run Simer
SP <- simer(SP)

Output

SIMER outputs annotation data, genotype data, and phenotype data in the following two formats.
Numeric format:
simer.geno.ind contains indices of genotyped individuals;
simer.geno.desc and simer.geno.bin contain genotype matrix of all individuals;
simer.map contains input map with block information and recombination information;
simer.ped contains pedigree of individuals;
simer.phe contains phenotype of individuals.
PLINK Binary format:
simer.bim contains marker information of genotype data;
simer.bed contains genotype data in binary format;
simer.fam contains sample information of genotype data;
simer.ped contains pedigree of individuals;
simer.phe contains phenotype of individuals.

Annotation data

Annotation data contains SNP name, Chromosome name, Base Position, ALT, REF, and the QTN genetic effect. Note that only markers selected as QTNs have values.

# Generate all simulation parameters
SP <- param.simer(out = "simer")

# Run Simer
SP <- simer(SP)

# Show annotation data
head(SP$map$pop.map)
  SNP Chrom     BP ALT REF QTN1_A
1  M1     1 130693   C   A     NA
2  M2     1 168793   G   A     NA
3  M3     1 286553   A   T     NA
4  M4     1 306913   C   G     NA
5  M5     1 350926   T   A     NA
6  M6     1 355889   A   C     NA

Genotype data

# Generate all simulation parameters
SP <- param.simer(out = "simer")

# Run Simer
SP <- simer(SP)

# Show genotype data
print(SP$geno$pop.geno)
$gen1
An object of class "big.matrix"
Slot "address":
<pointer: 0x00000000176f09e0>


$gen2
An object of class "big.matrix"
Slot "address":
<pointer: 0x00000000176ef940>

print(SP$geno$pop.geno$gen1[1:6, 1:6])
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    2    0    1    0    2
[2,]    1    1    1    1    0    0
[3,]    0    1    2    2    1    0
[4,]    2    0    1    1    1    0
[5,]    2    1    0    1    2    1
[6,]    1    2    1    1    1    2
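
As a small worked example of using the numeric genotype matrix, the base R sketch below computes the minor allele frequency (MAF) for the 6 x 6 block printed above and applies the 0.05 threshold used for filter_maf in the JSON example. It assumes that rows are markers, columns are individuals, and the codes 0, 1, and 2 count copies of one allele. The names freq, maf, and keep are illustrative.

```r
# The 6 x 6 block printed above, entered row by row
geno <- matrix(c(0, 2, 0, 1, 0, 2,
                 1, 1, 1, 1, 0, 0,
                 0, 1, 2, 2, 1, 0,
                 2, 0, 1, 1, 1, 0,
                 2, 1, 0, 1, 2, 1,
                 1, 2, 1, 1, 1, 2), nrow = 6, byrow = TRUE)

# Assumption: rows are markers, columns are individuals, codes count allele copies
freq <- rowMeans(geno) / 2     # frequency of the counted allele per marker
maf  <- pmin(freq, 1 - freq)   # minor allele frequency per marker
keep <- maf >= 0.05            # markers that would pass a MAF filter of 0.05

round(maf, 3)
```

If the matrix is stored with individuals in rows instead, colMeans should replace rowMeans.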

Phenotype data

Phenotype data contains sample ID, generation index, family index, within-family index, sire, dam, sex, phenotype, TBV, TGV, and other effects.

# Generate all simulation parameters
SP <- param.simer(out = "simer")

# Run Simer
SP <- simer(SP)

# Show phenotype data
head(SP$pheno$pop$gen1)
  index gen fam infam sir dam sex          T1     T1_TBV     T1_TGV   T1_A_eff    T1_E_eff
1     1   1   1     1   0   0   1  -0.4934935 -1.3507888 -1.3507888 -1.3507888   0.8572953
2     2   1   2     2   0   0   1   7.7710404 -1.6756353 -1.6756353 -1.6756353   9.4466757
3     3   1   3     3   0   0   1  -4.6567338 -2.2608387 -2.2608387 -2.2608387  -2.3958951
4     4   1   4     4   0   0   1  -5.9064589 -1.7394139 -1.7394139 -1.7394139  -4.1670450
5     5   1   5     5   0   0   1 -16.7438931 -2.8000846 -2.8000846 -2.8000846 -13.9438085
6     6   1   6     6   0   0   1   6.0043912  0.3413561  0.3413561  0.3413561   5.6630351
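
In the output printed above, the phenotype T1 equals the sum of the additive effect and the residual effect, and T1_TBV, T1_TGV, and T1_A_eff coincide because only an additive effect is simulated. The base R sketch below checks that decomposition on the six printed rows. The values are copied from the output above, and resid is an illustrative name.

```r
# Values copied from the phenotype output printed above
phe <- data.frame(
  T1       = c(-0.4934935, 7.7710404, -4.6567338, -5.9064589, -16.7438931, 6.0043912),
  T1_TBV   = c(-1.3507888, -1.6756353, -2.2608387, -1.7394139, -2.8000846, 0.3413561),
  T1_E_eff = c(0.8572953, 9.4466757, -2.3958951, -4.1670450, -13.9438085, 5.6630351)
)

# Difference between the phenotype and the sum of its components
resid <- phe$T1 - (phe$T1_TBV + phe$T1_E_eff)
max(abs(resid))
```

The differences are zero up to the rounding of the printed values.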

Citation

FAQ and Hints

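:sos: Question2: When installing packages from GitHub with “devtools”, the following error occurred:

Error in curl::curl_fetch_disk(url, x$path, handle = handle): Problem with the SSL CA cert (path? access rights?)

Answer: Disable SSL peer verification for the current R session with the following code, then run the installation again.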

library(httr)
set_config(config(ssl_verifypeer = 0L))

Questions, suggestions, and bug reports are welcome and appreciated.

📫 HIBLUP: Versatile and easy-to-use GS toolbox.	🏔️ IAnimal: an omics knowledgebase for animals.
🚴‍♂️ KAML: Advanced GS method for complex traits.	📊 CMplot: A drawing tool for genetic analyses.
📮 rMVP: Efficient and easy-to-use GWAS tool.

2	1	0	1	0	…
1	2	0	1	0	…
1	1	2	1	0	…
1	1	0	2	1	…
0	0	0	0	2	…

SIMER

Data Simulation for Life Science and Breeding

Authors:

:toolbox: Relevant software tools for genetic analyses and genomic breeding

Contents

Installation

Data Preparation

Genotype

Genetic map

Pedigree

Data Input

Basic

Optional

Quick Start

Quick Start for Population Simulation

Quick Start for Genotype Simulation

Quick Start for Phenotype Simulation

Genotype Simulation

Gallery of genotype simulation parameters

Generate an external or species-specific or random genetic map

Generate an external or species-specific or random genotype matrix

Generate a genotype matrix with complete linkage disequilibrium

Add chromosome crossovers and mutations to genotype matrix

Phenotype Simulation

Gallery of phenotype simulation parameters

Generate phenotype using an external or species-specific or random genotype matrix

Generate continuous phenotype

Generate case-control phenotype

Generate categorical phenotype

Generate phenotype using A model

Generate phenotype using AD model

Generate phenotype using GxG model

Generate phenotype using Repeated Record model

Generate phenotype controlled by QTNs subject to Normal distribution

Generate phenotype controlled by QTNs subject to Geometric distribution

Generate phenotype controlled by QTNs subject to Gamma distribution

Generate phenotype controlled by QTNs subject to Beta distribution

Generate phenotype with fixed effect and covariate and environmental random effect

Generate phenotype using GxE model

Generate phenotype using ExE model

Generate phenotype controlled by varied QTN effect distribution

Population Simulation of Multiple-Generation with Genotype and Phenotype

Gallery of population simulation parameters

Individual selection for a single trait

Family selection for a single trait

Within-family selection for a single trait

Combined selection for a single trait

Tandem selection for multiple traits

Independent culling selection for multiple traits

Index selection for multiple traits

Clone for plants

Doubled haploid for plants

Self-pollination for plants and micro-organisms

Random mating for plants and animals

Random mating excluding self-pollination for animals

Assortative mating for plants and animals

Disassortative mating for plants and animals

Two-way cross for animals

Three-way cross for animals

Four-way cross for animals

Back cross for animals

User-designed pedigree mating for plants and animals

AN EASY WAY TO GENERATE A POPULATION

Breeding Program Design

Gallery of breeding program design parameters

Preparation of a breeding program design

Evaluation of a breeding program design

Global Options

Gallery of global parameters

Counts of total population size

Multi-thread computation

Multi-population simulation

File output

Generation-selective output

Output

Annotation data

Genotype data

Phenotype data

Citation