This tutorial shows how to perform a GWAS with SLOPE. The analysis consists of three simple steps.
You need to provide paths to three files:
library(geneSLOPE)
famFile <- system.file("extdata", "plinkPhenotypeExample.fam", package = "geneSLOPE")
mapFile <- system.file("extdata", "plinkMapExample.map", package = "geneSLOPE")
snpsFile <- system.file("extdata", "plinkDataExample.raw", package = "geneSLOPE")
Once you have the phenotype, you can move on to reading the SNP data. Depending on the data size, reading SNPs may take a long time. Because the data is very large, SNPs are filtered by their marginal test p-value: all SNPs whose p-values are larger than the threshold \(pValMax\) are discarded. For details on how to choose \(pValMax\), see "How changing parameters affects my analysis?"
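The screening call below expects a `phenotype` object that is not created in the text above. A minimal sketch of creating it, assuming that the package's `read_phenotype` function is the intended reader for the .fam file and that `famFile` is the path defined earlier:

```r
library(geneSLOPE)

# Path to the example .fam file shipped with the package (same as above)
famFile <- system.file("extdata", "plinkPhenotypeExample.fam", package = "geneSLOPE")

# Assumption: read_phenotype() is the reader for the phenotype file
phenotype <- read_phenotype(filename = famFile)
```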
screening.result <- screen_snps(snpsFile, mapFile, phenotype, pValMax = 0.05,
                      chunkSize = 1e2, verbose = FALSE)
The parameter verbose = FALSE suppresses the progress bar. The default value is TRUE.
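To make the screening rule concrete, here is a minimal base R sketch of filtering SNPs by their marginal test p-value. It is an illustration on simulated data, not the package's implementation: the genotype matrix `X`, the phenotype `y`, and the helper `marginal_pvalue` are all hypothetical, and a simple per-SNP linear regression is assumed as the marginal test.

```r
# Hypothetical illustration of marginal screening; not geneSLOPE's internal code.
set.seed(1)
n <- 90    # observations
p <- 200   # SNPs

# Simulated genotypes coded as minor-allele counts 0/1/2
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)
colnames(X) <- paste0("snp", seq_len(p))

# Simulated phenotype driven by the first two SNPs plus noise
y <- 2 * X[, 1] - 2 * X[, 2] + rnorm(n)

# Marginal test: regress y on one SNP at a time and take the slope p-value
marginal_pvalue <- function(x, y) {
  if (var(x) == 0) return(1)  # a constant SNP carries no information
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}
pvals <- apply(X, 2, marginal_pvalue, y = y)

# Keep only SNPs whose marginal p-value is below the threshold
pValMax <- 0.05
keep <- which(pvals < pValMax)
X_screened <- X[, keep, drop = FALSE]

cat(ncol(X_screened), "of", p, "SNPs had p-value smaller than", pValMax, "\n")
```

Raising `pValMax` in this sketch lets more SNPs through to the later steps, at the cost of more memory and computation.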
You can inspect the result of reading and screening the dataset:
## Object of class screeningResult
## $X: data matrix
##   90  observations
##   52  snps
## 1000  SNPs were screened
## 52  snps had p-value smaller than  0.05  in marginal test
Once the data has been read successfully, one can move on to the second step of the analysis.
The last step of the analysis is to run SLOPE.
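Putting the three steps together, the whole analysis might look like the sketch below. The `select_snps(clumping.result, fdr = 0.1)` call is taken from the warning message shown in this tutorial. The functions `read_phenotype` and `clump_snps` and the value `rho = 0.5` are assumptions made for illustration, and are not necessarily the settings that produced the output shown here.

```r
library(geneSLOPE)

# Example files shipped with the package
famFile  <- system.file("extdata", "plinkPhenotypeExample.fam", package = "geneSLOPE")
mapFile  <- system.file("extdata", "plinkMapExample.map", package = "geneSLOPE")
snpsFile <- system.file("extdata", "plinkDataExample.raw", package = "geneSLOPE")

# Step 1: read the phenotype, then read and screen the SNPs
phenotype <- read_phenotype(filename = famFile)   # assumed reader function
screening.result <- screen_snps(snpsFile, mapFile, phenotype, pValMax = 0.05,
                                chunkSize = 1e2, verbose = FALSE)

# Step 2: clump highly correlated SNPs (rho is an illustrative value)
clumping.result <- clump_snps(screening.result, rho = 0.5, verbose = FALSE)

# Step 3: run SLOPE on the clump representatives
slope.result <- select_snps(clumping.result, fdr = 0.1)
```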
## Warning in select_snps(clumping.result, fdr = 0.1): All lambdas are equal. SLOPE does not guarantee
##             False Discovery Rate control
As before, one can plot and summarize the results.
## Object of class selectionResult
## 2 snps selected out of 41 clump representatives
## Effect size for selected snps (absolute values)
##  Min:  3.640963 
##  Mean:  3.768299 
##  Max:  3.895635 
## R square of the final model:  0.9756304 
## Kink value:  1
As with the result of clumping, it is possible to interactively identify the number of the clump that contains a specific SNP selected by SLOPE. The procedure is as follows: first plot the whole genome, then run the function and click on the SNP of interest.
Once the clump is identified, one can zoom into it.
It is easy to get information about the selected SNPs. To get the indices of the columns in the original SNP matrix that they refer to, use:
##  rs2719295_T rs17546815_T 
##          222          573
If a .map file was given, one can get more information about the SNPs:
##     chromosome         rs genetic_distance_(morgans)
## 222          8  rs2719295                    34.2919
## 573         11 rs17546815                   113.7420
##     base_pair_position_(bp_units)
## 222                      34291873
## 573                     113741770
For information about the SNPs that are part of a specific clump, use:
## Summary of 1 selected clump
##     chromosome         rs genetic_distance_(morgans)
## 222          8  rs2719295                    34.2919
## 377          2 rs11124642                    38.6580
## 598          2  rs4672803                   217.2340
## 906          6   rs325120                   147.8600
##     base_pair_position_(bp_units)
## 222                      34291873
## 377                      38657967
## 598                     217233745
## 906                     147860161
There are three numerical parameters that influence the result.
Input: \(rho \in (0, 1)\);