Title: | Ethnicity Annotation from Whole-Exome and Targeted Sequencing Data |
---|---|
Description: | Reliable and rapid ethnicity annotation from whole exome and targeted sequencing data. |
Authors: | Alessandro Romanel [aut, cre], Davide Dalfovo [aut] |
Maintainer: | Alessandro Romanel <[email protected]> |
License: | GPL-3 |
Version: | 3.0.1 |
Built: | 2024-11-07 03:41:41 UTC |
Source: | https://github.com/cibiobcg/ethseq |
This function performs ancestry analysis of a set of samples and reports the results.
ethseq.Analysis( target.vcf = NA, target.gds = NA, bam.list = NA, out.dir = tempdir(), model.gds = NA, model.available = NA, model.assembly = "hg38", model.pop = "All", model.folder = tempdir(), run.genotype = FALSE, aseq.path = tempdir(), mbq = 20, mrq = 20, mdc = 10, cores = 1, verbose = TRUE, composite.model.call.rate = 1, refinement.analysis = NA, space = "2D", bam.chr.encoding = FALSE )
Argument | Description |
---|---|
target.vcf | Path to the sample genotypes in VCF format |
target.gds | Path to the sample genotypes in GDS format |
bam.list | Path to a file containing a list of BAM file paths |
out.dir | Path to the folder where the output of the analysis is saved |
model.gds | Path to a GDS file specifying the reference model |
model.available | String specifying the pre-computed reference model to use |
model.assembly | String value indicating the assembly version to download for the pre-built models |
model.pop | String value indicating the population to download for the pre-built models |
model.folder | Path to the folder where reference models are already present or are downloaded when needed |
run.genotype | Logical value indicating whether ASEQ genotyping should be run |
aseq.path | Path to the folder where the ASEQ binary is available or is downloaded when needed |
mbq | Minimum base quality used in the pileup by ASEQ |
mrq | Minimum read quality used in the pileup by ASEQ |
mdc | Minimum read count acceptable for genotype inference by ASEQ |
cores | Number of parallel cores used for the analysis |
verbose | Print detailed information |
composite.model.call.rate | SNP call rate used to run Principal Component Analysis (PCA) |
refinement.analysis | Matrix specifying a tree of ancestry sets |
space | Dimensions of the PCA space used to infer ancestry ("2D" or "3D") |
bam.chr.encoding | Logical value indicating whether input BAM files have chromosomes encoded with the "chr" prefix |
Logical value indicating the success of the analysis
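As an illustration of how the arguments above fit together, the following is a minimal sketch of a VCF-based run against a pre-computed model. The input path is hypothetical, and the value given to `model.available` is an assumption that should be checked against the output of `getModelsList()`. The call is skipped unless the EthSEQ package is installed and the input file exists.

```r
# Hypothetical input: replace with the path to your own VCF file.
target_vcf <- file.path(tempdir(), "my_samples.vcf")

# Arguments follow the usage shown above.
analysis_args <- list(
  target.vcf = target_vcf,
  out.dir = file.path(tempdir(), "EthSEQ_Analysis"),
  model.available = "Gencode.Exome",  # assumption: verify with getModelsList()
  model.assembly = "hg38",
  model.pop = "All",
  cores = 1,
  verbose = TRUE,
  composite.model.call.rate = 1,
  space = "2D"
)

# Run only when the package is installed and the input file is present.
if (requireNamespace("EthSEQ", quietly = TRUE) && file.exists(target_vcf)) {
  ok <- do.call(EthSEQ::ethseq.Analysis, analysis_args)
  print(ok)  # logical value indicating the success of the analysis
} else {
  message("EthSEQ or the input VCF is not available; nothing was run.")
}
```

Keeping the arguments in a named list makes it easy to reuse the same settings with a different input, for example by swapping `target.vcf` for `target.gds`.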
This function creates a GDS reference model that can be used to perform EthSEQ ancestry analysis.
ethseq.RM( vcf.fn, annotations, out.dir = "./", model.name = "Reference.Model", bed.fn = NA, verbose = TRUE, call.rate = 1, cores = 1 )
Argument | Description |
---|---|
vcf.fn | Vector of paths to genotype files in VCF format |
annotations | data.frame mapping all sample names to their ancestries and gender |
out.dir | Path to the output folder |
model.name | Name of the output model |
bed.fn | Path to a BED file with regions of interest |
verbose | Print detailed information |
call.rate | SNP call rate cutoff for inclusion in the final reference model |
cores | Number of parallel cores used in the reference model generation |
Logical value indicating the success of the analysis
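A minimal sketch of building a custom reference model with the arguments described above. Both input paths are hypothetical, and the annotation file is assumed to be a tab-delimited table providing the sample names, ancestries and gender that `annotations` requires. The exact column names are not stated here, so none are assumed. Nothing is run unless the EthSEQ package is installed and both inputs exist.

```r
# Hypothetical inputs: replace with your own reference genotypes and annotations.
vcf_files <- c(file.path(tempdir(), "reference_genotypes.vcf"))
annot_file <- file.path(tempdir(), "reference_annotations.txt")

# TRUE only when every input file is present on disk.
inputs_ready <- all(file.exists(c(vcf_files, annot_file)))

if (requireNamespace("EthSEQ", quietly = TRUE) && inputs_ready) {
  # Assumption: the annotation file is tab-delimited with a header row.
  annotations <- read.delim(annot_file, stringsAsFactors = FALSE)

  ok <- EthSEQ::ethseq.RM(
    vcf.fn = vcf_files,
    annotations = annotations,
    out.dir = file.path(tempdir(), "EthSEQ_Model"),
    model.name = "Reference.Model",
    bed.fn = NA,
    verbose = TRUE,
    call.rate = 1,
    cores = 1
  )
  print(ok)  # logical value, as documented above
} else {
  message("EthSEQ or the input files are not available; nothing was run.")
}
```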
This function prints the list of all available models.
getModelsList()
data.frame of all available models to use with specified assembly and population
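A short sketch of inspecting the available models before choosing values for `model.available`, `model.assembly` and `model.pop`. The column names of the returned data.frame are not listed here, so the example only prints its structure. The call is wrapped in `tryCatch` on the assumption that it may need network access.

```r
models <- NULL

if (requireNamespace("EthSEQ", quietly = TRUE)) {
  # Fall back to NULL if the list cannot be retrieved.
  models <- tryCatch(
    EthSEQ::getModelsList(),
    error = function(e) {
      message("Could not retrieve the models list: ", conditionMessage(e))
      NULL
    }
  )
}

if (is.data.frame(models)) {
  str(models)          # column names and types
  print(head(models))  # first few available models
}
```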
This function prints the list of all 1000 Genomes Project samples used to build the reference models.
getSamplesInfo()
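A brief sketch of calling this function defensively. The type of the returned object is not documented here, so the example stores whatever is returned and prints only its structure. As above, the `tryCatch` wrapper assumes the call may need network access.

```r
samples_info <- NULL

if (requireNamespace("EthSEQ", quietly = TRUE)) {
  # Fall back to NULL if the sample list cannot be retrieved.
  samples_info <- tryCatch(
    EthSEQ::getSamplesInfo(),
    error = function(e) {
      message("Could not retrieve the samples list: ", conditionMessage(e))
      NULL
    }
  )
}

if (!is.null(samples_info)) {
  str(samples_info)  # inspect the structure without assuming column names
}
```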