Command line interface#
The tsinfer command line interface provides convenient access to the high-level API functionality. There are two equivalent ways to invoke the program:
$ tsinfer
or
$ python3 -m tsinfer
The first form is more intuitive and works well most of the time. The second form is useful when multiple versions of Python are installed or if the tsinfer executable is not installed on your path.
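As an illustration of choosing between the two forms from a script, the following minimal sketch prefers the installed executable and falls back to the module form. The function name `tsinfer_base_command` and its `which` parameter are hypothetical helpers introduced here, not part of tsinfer.

```python
import shutil
import sys


def tsinfer_base_command(which=shutil.which):
    """Return the argv prefix used to launch tsinfer.

    `which` is injectable only so that both branches can be exercised
    without changing the real PATH; by default it is shutil.which.
    """
    executable = which("tsinfer")
    if executable is not None:
        # First form: the `tsinfer` executable was found on the PATH.
        return [executable]
    # Second form: run the module with the current Python interpreter,
    # which also pins the Python version that is used.
    return [sys.executable, "-m", "tsinfer"]
```

A subcommand and its arguments can then be appended to the returned list before passing it to `subprocess.run`.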
The tsinfer program has seven subcommands: list prints a summary of the data held in one of tsinfer’s file formats; infer runs the complete inference process for a given input samples file; generate-ancestors, match-ancestors and match-samples run the three parts of this inference process as separate steps; augment-ancestors adds a subset of the samples to the ancestors tree sequence; and verify checks that a tree sequence and a samples file represent the same data. Several subcommands also have short aliases (ga, ma, aa, ms and ls). Running the inference as separate steps is recommended for large inferences because it allows greater control over the inference process.
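The three-step workflow can be sketched as a list of command lines built in Python. This is only an illustration that assumes the default intermediate file locations are used; the helper name `stepwise_commands` is hypothetical, while the subcommand names and the `--num-threads` and `--progress` options are those documented in the argument details.

```python
def stepwise_commands(samples, num_threads=None, progress=False):
    """Build argv lists for the three separate inference steps, in order."""
    common = []
    if num_threads is not None:
        common += ["--num-threads", str(num_threads)]
    if progress:
        common.append("--progress")
    steps = ["generate-ancestors", "match-ancestors", "match-samples"]
    # Each step takes the same samples file as its positional argument.
    return [["tsinfer", step, *common, samples] for step in steps]
```

Each returned list can be passed to `subprocess.run(argv, check=True)` in turn, so that a failure in one step stops the later steps from running.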
Argument details#
Command line interface for tsinfer.
usage: tsinfer [-h] [-V]
{generate-ancestors,ga,match-ancestors,ma,augment-ancestors,aa,match-samples,ms,infer,list,ls,verify}
...
Positional Arguments#
- subcommand
Possible choices: generate-ancestors, ga, match-ancestors, ma, augment-ancestors, aa, match-samples, ms, infer, list, ls, verify
Named Arguments#
- -V, --version
show program’s version number and exit
Sub-commands#
generate-ancestors (ga)#
Generates a set of ancestors from the input sample data and stores the results in a tsinfer ancestors file.
tsinfer generate-ancestors [-h] [-a ANCESTORS] [--num-threads NUM_THREADS]
[--num-flush-threads NUM_FLUSH_THREADS]
[--progress] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
samples
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
Named Arguments#
- -a, --ancestors
The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’
- --num-threads, -t
The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
- --num-flush-threads, -F
The number of data flush threads to use. If < 1, all data is flushed synchronously in the main thread (default=2)
- --progress, -p
Show a progress monitor.
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
match-ancestors (ma)#
Matches the ancestors built by the ‘generate-ancestors’ command against each other using the model information specified in the input file and writes the output to a tskit .trees file.
tsinfer match-ancestors [-h] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[-a ANCESTORS] [-A ANCESTORS_TREES]
[--num-threads NUM_THREADS] [--progress]
[--no-path-compression]
[--recombination-rate RECOMBINATION_RATE]
[--mismatch-ratio MISMATCH_RATIO]
samples
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
Named Arguments#
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- -a, --ancestors
The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’
- -A, --ancestors-trees
The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
- --num-threads, -t
The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
- --progress, -p
Show a progress monitor.
- --no-path-compression
Disable path compression
- --recombination-rate
The recombination rate per unit genome
- --mismatch-ratio
The mismatch ratio, which measures the relative importance of multiple mutation/error versus recombination during inference. This defaults to unity if a recombination rate or map is specified.
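As a sketch of how the matching options combine, the hypothetical helper below builds a match-ancestors command line and only adds `--recombination-rate`, `--mismatch-ratio` and `--no-path-compression` when they are requested. Leaving `mismatch_ratio` unset relies on the documented default. The numeric values a caller passes are illustrative and are not recommendations.

```python
def match_ancestors_command(samples, recombination_rate=None,
                            mismatch_ratio=None, path_compression=True):
    """Build an argv list for `tsinfer match-ancestors`."""
    argv = ["tsinfer", "match-ancestors"]
    if recombination_rate is not None:
        argv += ["--recombination-rate", str(recombination_rate)]
    if mismatch_ratio is not None:
        # Omitted by default so that tsinfer applies its own default.
        argv += ["--mismatch-ratio", str(mismatch_ratio)]
    if not path_compression:
        argv.append("--no-path-compression")
    argv.append(samples)
    return argv
```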
augment-ancestors (aa)#
Augments the ancestors tree sequence by adding a subset of the samples
tsinfer augment-ancestors [-h] [-n NUM_SAMPLES] [-A ANCESTORS_TREES] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[--no-path-compression] [--num-threads NUM_THREADS]
[--progress]
[--recombination-rate RECOMBINATION_RATE]
[--mismatch-ratio MISMATCH_RATIO]
samples augmented_ancestors
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
- augmented_ancestors
The path to write the augmented ancestors tree sequence to
Named Arguments#
- -n, --num-samples
The number of samples to use. Defaults to 10% of the total.
- -A, --ancestors-trees
The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- --no-path-compression
Disable path compression
- --num-threads, -t
The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
- --progress, -p
Show a progress monitor.
- --recombination-rate
The recombination rate per unit genome
- --mismatch-ratio
The mismatch ratio, which measures the relative importance of multiple mutation/error versus recombination during inference. This defaults to unity if a recombination rate or map is specified.
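One plausible way to use augment-ancestors is sketched below: the augmented tree sequence is written to a new file, and that file is then given to match-samples through `--ancestors-trees`. This ordering is an assumption based on the options listed on this page rather than a documented recipe, and the helper name `augment_then_match` is hypothetical.

```python
def augment_then_match(samples, augmented, num_samples=None):
    """Build argv lists for augment-ancestors followed by match-samples."""
    augment = ["tsinfer", "augment-ancestors"]
    if num_samples is not None:
        # Omitted by default so that tsinfer uses 10% of the samples.
        augment += ["--num-samples", str(num_samples)]
    # Positional order: the samples file, then the output path.
    augment += [samples, augmented]
    # Assumption: the augmented file replaces the default ancestors trees.
    match = ["tsinfer", "match-samples", "--ancestors-trees", augmented, samples]
    return [augment, match]
```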
match-samples (ms)#
Matches the samples against the tree sequence structure built by the match-ancestors command
tsinfer match-samples [-h] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[-A ANCESTORS_TREES] [--no-path-compression]
[--no-post-process] [-O OUTPUT_TREES]
[--num-threads NUM_THREADS] [--progress]
[--recombination-rate RECOMBINATION_RATE]
[--mismatch-ratio MISMATCH_RATIO]
samples
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
Named Arguments#
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- -A, --ancestors-trees
The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
- --no-path-compression
Disable path compression
- --no-post-process, --no-simplify
Do not post process the output tree sequence
- -O, --output-trees
The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’
- --num-threads, -t
The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
- --progress, -p
Show a progress monitor.
- --recombination-rate
The recombination rate per unit genome
- --mismatch-ratio
The mismatch ratio, which measures the relative importance of multiple mutation/error versus recombination during inference. This defaults to unity if a recombination rate or map is specified.
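The default file names described for the ancestors, ancestors trees and output trees options can be summarised with a small sketch. It follows the rule as worded on this page (samples file stem plus a new extension) and assumes the samples file name has a single extension such as ‘.samples’. It is not taken from tsinfer’s source, and `default_paths` is a hypothetical name.

```python
from pathlib import PurePath


def default_paths(samples):
    """Derive the default intermediate and output paths for a samples file."""
    path = PurePath(samples)
    stem = path.stem  # file name without its final extension

    def sibling(extension):
        # Same directory as the samples file, with the new extension.
        return str(path.with_name(stem + extension))

    return {
        "ancestors": sibling(".ancestors"),
        "ancestors_trees": sibling(".ancestors.trees"),
        "output_trees": sibling(".trees"),
    }
```

For the input ‘1kg-chr1.samples’ this yields the three file names given in the examples on this page.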
infer#
Runs the generate-ancestors, match-ancestors and match-samples steps in one go. Not recommended for large inferences.
tsinfer infer [-h] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[-O OUTPUT_TREES] [--no-path-compression]
[--num-threads NUM_THREADS] [--progress] [--no-post-process]
[--recombination-rate RECOMBINATION_RATE]
[--mismatch-ratio MISMATCH_RATIO] [--keep-intermediates]
[-a ANCESTORS] [-A ANCESTORS_TREES]
samples
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
Named Arguments#
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- -O, --output-trees
The path to the output trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default output file would be ‘1kg-chr1.trees’
- --no-path-compression
Disable path compression
- --num-threads, -t
The number of worker threads to use. If < 1, use a simpler unthreaded algorithm (default).
- --progress, -p
Show a progress monitor.
- --no-post-process, --no-simplify
Do not post process the output tree sequence
- --recombination-rate
The recombination rate per unit genome
- --mismatch-ratio
The mismatch ratio, which measures the relative importance of multiple mutation/error versus recombination during inference. This defaults to unity if a recombination rate or map is specified.
- --keep-intermediates, -k
Keep the intermediate ancestors and ancestors-tree-sequence files. To override the default locations where these files are saved, use the --ancestors and --ancestors-trees options.
- -a, --ancestors
The path to the ancestor data file in tsinfer ‘ancestors’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors file would be ‘1kg-chr1.ancestors’
- -A, --ancestors-trees
The path to the ancestor trees file in tskit ‘.trees’ format. If not specified, this defaults to the input samples file stem with the extension ‘.ancestors.trees’. For example, if ‘1kg-chr1.samples’ is the input file then the default ancestors trees file would be ‘1kg-chr1.ancestors.trees’.
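A single-step run that keeps its intermediate files might be assembled as in the sketch below. The helper name `infer_command` and the example paths in the usage note are hypothetical; only the option names come from the argument list above.

```python
def infer_command(samples, keep_intermediates=False, ancestors=None,
                  ancestors_trees=None, output_trees=None):
    """Build an argv list for `tsinfer infer`."""
    argv = ["tsinfer", "infer"]
    if keep_intermediates:
        argv.append("--keep-intermediates")
    # Optional overrides for where the intermediate and output files go.
    if ancestors is not None:
        argv += ["--ancestors", ancestors]
    if ancestors_trees is not None:
        argv += ["--ancestors-trees", ancestors_trees]
    if output_trees is not None:
        argv += ["--output-trees", output_trees]
    argv.append(samples)
    return argv
```

For example, `infer_command("1kg-chr1.samples", keep_intermediates=True, ancestors="tmp/chr1.ancestors")` keeps both intermediate files and moves only the ancestors file, leaving the ancestors trees file at its default location.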
list (ls)#
Show a summary of the specified tsinfer related file.
tsinfer list [-h] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[--storage]
path
Positional Arguments#
- path
The tsinfer file to show information about.
Named Arguments#
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- --storage, -s
Show detailed information about data storage.
verify#
Verify that the specified tree sequence and samples files represent the same data
tsinfer verify [-h] [-v]
[--log-section {tsinfer.inference,tsinfer.formats,tsinfer.threads}]
[--progress]
samples tree_sequence
Positional Arguments#
- samples
The input sample data in tsinfer ‘samples’ format. Please see the documentation at https://tskit.dev/tsinfer/docs/ for information on how to import data into this format.
- tree_sequence
The tree sequence to compare with in .trees format.
Named Arguments#
- -v, --verbosity
Increase the verbosity
- --log-section, -L
Possible choices: tsinfer.inference, tsinfer.formats, tsinfer.threads
Log messages only for the specified module
- --progress, -p
Show a progress monitor.