Command line interface#

The tsinfer command line interface runs the inference pipeline using a TOML configuration file. See the quickstart for an introduction and the config reference for all available options.

$ tsinfer run config.toml --threads 4 -v

tsinfer#

tsinfer: infer tree sequences from genetic variation data.

Usage

tsinfer [OPTIONS] COMMAND [ARGS]...

Options

--version#

Show the version and exit.

augment-sites#

Place non-inference sites onto a tree sequence using parsimony.

Usage

tsinfer augment-sites [OPTIONS] CONFIG

Options

--input <input_ts>#

Required. Input tree sequence file (output of match or post-process).

--output <output_path>#

Required. Output tree sequence file.

-t, --threads <threads>#

Worker threads.

Default:

1

-f, --force#

Overwrite existing output files.

-p, --progress#

Show per-step progress bars.

-v, --verbose#

Increase log verbosity.

-l, --log-file <log_file>#

Write log messages to this file instead of stderr.

Arguments

CONFIG#

Required argument

config#

Utilities for inspecting and validating the config file.

Usage

tsinfer config [OPTIONS] COMMAND [ARGS]...

check#

Validate the config and verify all input paths exist.

Usage

tsinfer config check [OPTIONS] CONFIG

Arguments

CONFIG#

Required argument

show#

Print the resolved config with defaults filled in.

Usage

tsinfer config show [OPTIONS] CONFIG

Arguments

CONFIG#

Required argument

infer-ancestors#

Build the ancestor VCZ store from the samples VCZ.

Usage

tsinfer infer-ancestors [OPTIONS] CONFIG

Options

-t, --threads <threads>#

Worker threads.

Default:

1

-f, --force#

Overwrite existing output files.

-p, --progress#

Show per-step progress bars.

-v, --verbose#

Increase log verbosity.

-l, --log-file <log_file>#

Write log messages to this file instead of stderr.

-w, --write-threads <write_threads>#

Writer threads for Zarr I/O.

Default:

2

--genotype-encoding <genotype_encoding>#

Genotype storage encoding. one-bit saves memory; eight-bit is faster but uses ~8x more memory. eight-bit is required when genotypes contain missing data.

Default:

'eight-bit'

Options:

eight-bit | one-bit
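To get a feel for the ~8x memory difference between the two values of --genotype-encoding, the following is a rough back-of-the-envelope sketch. It assumes one byte per genotype for eight-bit and one bit per genotype for one-bit, and it ignores chunking, caching and any other overhead. The function name estimate_genotype_bytes is hypothetical and not part of tsinfer.

```shell
# Hypothetical helper: rough size of a sites x haplotypes genotype matrix.
# Assumes 1 byte per genotype (eight-bit) or 1 bit per genotype (one-bit);
# real memory use also depends on chunking and other overhead.
estimate_genotype_bytes() {
    sites=$1
    haplotypes=$2
    encoding=$3
    total=$(( sites * haplotypes ))
    case "$encoding" in
        eight-bit) echo "$total" ;;
        one-bit)   echo $(( (total + 7) / 8 )) ;;   # round up to whole bytes
        *)         echo "unknown encoding: $encoding" >&2; return 1 ;;
    esac
}

estimate_genotype_bytes 1000000 2000 eight-bit
estimate_genotype_bytes 1000000 2000 one-bit
```

Under these assumptions, a dataset of one million sites and 2000 haplotypes needs about 2 GB in eight-bit form and about 250 MB in one-bit form.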

Arguments

CONFIG#

Required argument

match#

Run the unified match loop (ancestors + samples).

Usage

tsinfer match [OPTIONS] CONFIG

Options

--workdir <workdir>#

Directory for checkpoints; enables resume on restart.
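As an illustration of how --workdir might be combined with --group-stop, the sketch below prints (rather than runs) a two-phase match. The directory name checkpoints and the file name config.toml are placeholders, and the assumption that the second invocation picks up from the first one's checkpoints rests only on the "enables resume on restart" description above.

```shell
# Dry-run sketch: print the commands instead of executing them.
# "checkpoints" is a hypothetical directory name.
WORKDIR=checkpoints
CONFIG=config.toml

# Phase 1: stop early, leaving checkpoints in the work directory.
first_phase="tsinfer match --workdir $WORKDIR --group-stop 2 $CONFIG"
# Phase 2: same work directory, no stop point, so matching can continue.
second_phase="tsinfer match --workdir $WORKDIR $CONFIG"

echo "$first_phase"
echo "$second_phase"
```

Remove the echo indirection and run the two commands directly once the paths match your setup.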

--keep-intermediates#

Keep all intermediate .trees files in workdir.

-c, --cache-size <cache_size>#

Genotype chunk cache size in MiB.

Default:

256

--group-stop <group_stop>#

Stop before this group index (0-indexed, like range()). e.g. --group-stop 2 processes groups 0 and 1. Requires --workdir for useful resume behavior.
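The half-open, range()-style convention for --group-stop can be made concrete with a small sketch that lists which group indexes fall before a given stop value. The helper groups_before_stop is hypothetical and only illustrates the indexing; it does not call tsinfer.

```shell
# Hypothetical helper: list the 0-indexed groups processed when matching
# stops before index $1 (the stop index itself is excluded).
groups_before_stop() {
    stop=$1
    i=0
    out=""
    while [ "$i" -lt "$stop" ]; do
        out="${out:+$out }$i"   # append with a separating space
        i=$(( i + 1 ))
    done
    echo "$out"
}

groups_before_stop 2   # prints: 0 1
```

A stop value of 0 therefore processes no groups at all.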

--read-workers <read_workers>#

Background threads for loading genotype chunks. [default: threads/2, minimum 1]
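The documented default for --read-workers, "threads/2, minimum 1", can be sketched as below. Integer (floor) division is an assumption here, since the description does not say how odd thread counts are rounded, and default_read_workers is a hypothetical name.

```shell
# Hypothetical helper mirroring "threads/2, minimum 1".
# Floor division is an assumption; the source does not specify rounding.
default_read_workers() {
    threads=$1
    workers=$(( threads / 2 ))
    if [ "$workers" -lt 1 ]; then
        workers=1
    fi
    echo "$workers"
}

default_read_workers 1   # prints: 1
default_read_workers 8   # prints: 4
```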

--match-file <match_file>#

Write per-haplotype match results as JSON lines to this file.

-t, --threads <threads>#

Worker threads.

Default:

1

-f, --force#

Overwrite existing output files.

-p, --progress#

Show per-step progress bars.

-v, --verbose#

Increase log verbosity.

-l, --log-file <log_file>#

Write log messages to this file instead of stderr.

Arguments

CONFIG#

Required argument

post-process#

Post-process a matched tree sequence.

Usage

tsinfer post-process [OPTIONS] CONFIG

Options

--input <input_ts>#

Required. Input tree sequence file.

-t, --threads <threads>#

Worker threads.

Default:

1

-f, --force#

Overwrite existing output files.

-p, --progress#

Show per-step progress bars.

-v, --verbose#

Increase log verbosity.

-l, --log-file <log_file>#

Write log messages to this file instead of stderr.

Arguments

CONFIG#

Required argument

run#

Run the full pipeline: infer-ancestors, match, post-process, augment-sites.

Usage

tsinfer run [OPTIONS] CONFIG
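For comparison with the single run command, the sketch below prints the four stages as separate invocations in pipeline order. It is a dry run only. The file names matched.trees, processed.trees and final.trees and the directory work are hypothetical placeholders; where match and post-process actually write their output is determined by the config, not by this sketch.

```shell
# Dry-run sketch: prints one command per pipeline stage.
# All .trees paths and the "work" directory are hypothetical placeholders.
CONFIG=config.toml
THREADS=4

stage_commands() {
    echo "tsinfer infer-ancestors --threads $THREADS $CONFIG"
    echo "tsinfer match --threads $THREADS --workdir work $CONFIG"
    echo "tsinfer post-process --input matched.trees $CONFIG"
    echo "tsinfer augment-sites --input processed.trees --output final.trees --threads $THREADS $CONFIG"
}

stage_commands
```

Running the stages separately is mainly useful when a stage needs options that run does not expose, such as --workdir or --group-stop on match.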

Options

-c, --cache-size <cache_size>#

Genotype chunk cache size in MiB.

Default:

256

--genotype-encoding <genotype_encoding>#

Genotype storage encoding. one-bit saves memory; eight-bit is faster but uses ~8x more memory. eight-bit is required when genotypes contain missing data.

Default:

'eight-bit'

Options:

eight-bit | one-bit

--read-workers <read_workers>#

Background threads for loading genotype chunks. [default: threads/2, minimum 1]

--match-file <match_file>#

Write per-haplotype match results as JSON lines to this file.

-t, --threads <threads>#

Worker threads.

Default:

1

-f, --force#

Overwrite existing output files.

-p, --progress#

Show per-step progress bars.

-v, --verbose#

Increase log verbosity.

-l, --log-file <log_file>#

Write log messages to this file instead of stderr.

Arguments

CONFIG#

Required argument

show-match-jobs#

Show a histogram of match-jobs group sizes.

Usage

tsinfer show-match-jobs [OPTIONS] JSON_FILE

Arguments

JSON_FILE#

Required argument