Command line options
Systematic Long Range Phasing (SLRP) phases and imputes genotypes from isolated founder populations, or from other populations where each individual has at least one other individual sharing each of his/her chromosomes. The software is command line based; it reads several input formats and writes a few others. I recommend using VCF as the input and output format, as it seems to be the most versatile and best defined. The FAD format is old and should not be used, but its specification can be found here.
The list of options below is quite long, but to get started you only need "--vcfFile" and "--geneticMap", which define your input genotypes and the genetic map of the chromosome.
Usage: SLRP [options]
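As a minimal sketch of the "getting started" case, the snippet below assembles a command line from the two required options, plus the optional --outVCF. The file names are hypothetical placeholders, and it assumes the program is reachable as "SLRP" (as in the usage line above); adjust both to your installation.

```python
import shlex


def build_slrp_command(vcf_file, genetic_map, out_vcf=None, executable="SLRP"):
    """Assemble a minimal SLRP argument list from the two required inputs."""
    cmd = [
        executable,
        "--vcfFile=" + vcf_file,        # input genotypes
        "--geneticMap=" + genetic_map,  # genetic map of the chromosome
    ]
    if out_vcf is not None:
        cmd.append("--outVCF=" + out_vcf)  # phased VCF output
    return cmd


# Hypothetical file names, used only for illustration.
cmd = build_slrp_command(
    "chr20.vcf.gz", "genetic_map_chr20.txt", out_vcf="chr20.phased.vcf.gz"
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to actually start a run, for example in a loop over chromosomes.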
Options
--version show program's version number and exit
-h, --help show this help message and exit
Input options: Options defining the input files. All options accept gzip compressed files.
-f FILE, --fadFile=FILE
Name of the genotype file in FAD format. [default:[]]
-v FILE, --vcfFile=FILE
Name of the VCF file with genotypes.
-t FILE, --tpedFile=FILE
Name of the genotype file in tped format with
genotypes coded as A,C,G,T. [default:[]]
-C CHROM, --chrom=CHROM
Use this chromosome (for tped input, this must be
1-22,X,Y,XY or MT) [default:none]
-m FILE, --geneticMap=FILE
Input file with the genetic map in hapmap format
[default:none]
--loadLikelihoods=FILE
Load initial likelihoods from a file (probably not a
sensible thing to do) [default:none]
-i FILE, --ibdFile=FILE
Name of the input IBD file [default:none]
-D FILE, --densePanel=FILE
File listing the individuals in the 'dense' genotype
panel. This is the set of individuals used as source
of genotype imputation or target of genotype
improvement (or phasing with prephased non-dense
haplotypes). Note that white spaces in the individual
names will be replaced with ':'. [default:none]
IMPUTATION TO A DENSE PANEL IS CURRENTLY UNTESTED AND
LIKELY NOT TO WORK!
-l NUM, --scoreLim=NUM
Minimum score above which to use the IBD matches
loaded from a file [default:0.0]
-R FILE, --freq=FILE
File to read the allele frequencies from.
[default:none]
--famFile=FILE Name of the file listing the sample names in the FAD
file. [default:none]
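Before a long run it can be useful to sanity-check the genetic map passed to --geneticMap. The sketch below assumes the common three-column HapMap layout (position in bp, rate in cM/Mb, cumulative map position in cM, after one header line); SLRP's own parser may accept other variants. The helper names and the example rows are made up for illustration.

```python
import gzip


def open_maybe_gzip(path):
    """Open a text file, transparently decompressing it if it is gzip compressed."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic number
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def parse_hapmap_map(lines):
    """Parse (position_bp, rate_cM_per_Mb, map_cM) rows, skipping header and blank lines."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # header or blank line
        rows.append((int(fields[0]), float(fields[1]), float(fields[2])))
    return rows


def check_map(rows):
    """Return True if positions strictly increase and cM values never decrease."""
    return all(a[0] < b[0] and a[2] <= b[2] for a, b in zip(rows, rows[1:]))


# Made-up rows in the assumed layout, for illustration only.
example = [
    "position COMBINED_rate(cM/Mb) Genetic_Map(cM)",
    "72765 0.1245 0.0000",
    "94172 0.1294 0.0027",
    "94426 0.1306 0.0027",
]
rows = parse_hapmap_map(example)
print(len(rows), check_map(rows))
```

For a real file, pass the handle returned by open_maybe_gzip(path) to parse_hapmap_map instead of the example list.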
Output options: Options defining the files in which to store the output. Most options gzip compress the output when the file name ends with .gz.
-o FILE, --outFile=FILE
Name of the long range phased FAD output file.
Deprecated because of difficulties with allele coding.
Recommend using --freq with this [default:none]
-O FILE, --outVCF=FILE
Name of the long range phased VCF output file
[default:SLRP_12074.vcf]
--IMPUTEfile=FILE Output the imputed results to FILE in IMPUTE format
[default:none]
-S FILE, --ibdSegCalls=FILE
Output IBD segments between the individuals to FILE
[default:none]
-Q FILE, --outQualities=FILE
Output quality scores for each site. The scores are
negative base 10 log posterior probability ratios
between the two phasings. (Not meaningful for
homozygous sites.)
[default:none]
-c FILE, --ibdCover=FILE
Output the number of individuals sharing a tract IBD.
[default:none]
-W FILE, --writeFreq=FILE
File to write the allele frequency estimates to
[default:none]
-L FILE, --likeFile=FILE
Name of the file for final MAP values as negative
natural logarithm [default:none]
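The --outQualities and --likeFile outputs are both on negative log scales, with different bases. The sketch below only inverts the two transforms named in the option descriptions; the layout of the files and the direction of the ratio (which phasing is in the numerator) are not specified here, so the function names and that reading are assumptions.

```python
import math


def ratio_from_quality(score):
    """Invert score = -log10(ratio) to recover the posterior probability ratio."""
    return 10.0 ** (-score)


def value_from_neg_ln(neg_log_value):
    """Invert x = -ln(v) to recover the value v."""
    return math.exp(-neg_log_value)


# A larger quality score corresponds to a smaller ratio, i.e. the two
# phasings are further apart in posterior probability.
for score in (1.0, 2.0, 3.0):
    print(score, ratio_from_quality(score))

print(value_from_neg_ln(0.0))
```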
Model parameter options: Options specifying the model parameters that affect the results.
--ExpectedIBS=NUM Expected length of a non-IBD IBS segment (in
centiMorgans) [default:1.0]
--ExpectedIBD=NUM Expected length of an IBD segment (in centiMorgans)
[default:10.0]
-p NUM, --prob_ibd=NUM
Probability that two haplotypes in the population
match [default:none]
--IBDtransLimit=NUM
Upper limit for the probability of a noIBD to IBD
transition between two markers. Default taken from
population probabilities. [default: 4 * prob_ibd]
-T NUM, --CallThreshold=NUM
Minimum fold difference in posterior probability
between the two most probable phases for a phase to be
called. If given, takes precedence over phredThreshold.
[default:none]
-P NUM, --phredThreshold=NUM
Minimum phred scaled quality score to call phases.
[default:10.0]
-e NUM, --genotypingErrRate=NUM
Estimated genotyping error rate [default:0.001]
--outputInferredGenotypes
Output haplotypes that might have different genotypes
from the input. These might be more accurate than those
called by the genotyping assay. [default:False]
Computational complexity options: Options mostly affecting computational complexity of the process. These should only be adjusted if the program takes too much time or memory. The effect of these options on the quality of the output should be low, but your mileage may vary.
--float Use floats instead of doubles. Saves half the memory
but might affect numerical precision and stability.
[default:False]
-n NUM, --procs=NUM
Number of processors to use in parallel [default:1]
--slice_length=NUM Length of alignment slice in markers (to save memory)
[default:-1]
-F, --fastPreProc Do pre-processing (putative IBD segment finding) with
the fast sweepline method, which disregards prior phase
information and IBD-noIBD transition probabilities
[default:False]
--IBDcoverLimit=NUM
Soft lower limit for the number of IBD-sharing
segments. Only do message passing on the longest NUM
segments covering a locus. The selection is greedy,
hence more than the minimum number of segments might
be used. For a large number of individuals, 10 might be
a good value. A non-positive value turns off this
limit. [default: 15]
--minIBDlength=NUM Hard lower limit for length of IBD segment in markers.
This should speed up computation by disregarding
uninformative IBS segments. [default: 10]
-I NUM, --iterations=NUM
Maximum number of iterations to run the message
passing. [default:30]
-d NUM, --damping=NUM
Damping factor for message updates [default:0.75]
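The --iterations and --damping options interact: damping slows each message update but can stop the messages from oscillating, so that the iteration cap is not reached. The sketch below is a generic damped fixed-point iteration, not SLRP's implementation. It assumes the damping factor weights the previous message (SLRP may use the opposite convention), and the toy update function and tolerance are made up for illustration.

```python
def damped_update(old, proposed, damping=0.75):
    """Blend the previous message with the newly computed one."""
    if not 0.0 <= damping <= 1.0:
        raise ValueError("damping must be between 0 and 1")
    return [damping * o + (1.0 - damping) * p for o, p in zip(old, proposed)]


def iterate(update_fn, message, damping=0.75, max_iterations=30, tol=1e-6):
    """Run damped updates until the largest change drops below tol or the cap is hit."""
    for iteration in range(1, max_iterations + 1):
        proposed = update_fn(message)
        new = damped_update(message, proposed, damping)
        change = max(abs(n - o) for n, o in zip(new, message))
        message = new
        if change < tol:
            break
    return message, iteration


def swap(message):
    """Toy update that flips a two-state message; undamped, it oscillates forever."""
    return [message[1], message[0]]


damped, n_damped = iterate(swap, [1.0, 0.0], damping=0.75)
undamped, n_undamped = iterate(swap, [1.0, 0.0], damping=0.0)
print(damped, n_damped)
print(undamped, n_undamped)
```

In this toy case the damped run settles near [0.5, 0.5] before the cap of 30 iterations, while the undamped run keeps flipping until the cap is reached.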
Untouchable options: These options are of no use to the regular user and should not be touched. They are likely to crash the program or give completely wrong results. YOU HAVE BEEN WARNED!
--verbose Output some more diagnostics [default:False]
--seed=NUM Seed for random number generator [default:none]
--use_sum_product Use sum-product (marginal posteriors) instead of
max-product (maximum a posteriori) algorithm.
--test Activate some testing thingies [default:False]
--intermediate Save intermediate FAD and IBD files [default:False]
--mpi Use MPI to distribute the computational and memory
load [default:False]