File formats
tsinfer uses the excellent zarr library to encode data in a form that is both compact and efficient to process. See the API documentation for details on how to construct and manipulate these files using Python. The tsinfer list command provides a way to print out a summary of these files.
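As a rough illustration, the summary command can also be driven from a Python script. This is a minimal sketch, assuming the tsinfer command-line tool is installed and on PATH; the helper names `build_list_command` and `summarise`, and the file name `example.samples`, are hypothetical and not part of tsinfer.

```python
import shutil
import subprocess


def build_list_command(path):
    """Return the argument list for `tsinfer list <path>`."""
    return ["tsinfer", "list", str(path)]


def summarise(path):
    """Run `tsinfer list` on a file and return the text it prints.

    Returns None when the tsinfer command-line tool is not on PATH.
    """
    if shutil.which("tsinfer") is None:
        return None
    result = subprocess.run(
        build_list_command(path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


if __name__ == "__main__":
    # "example.samples" is a placeholder file name (an assumption).
    print(build_list_command("example.samples"))
```

Calling `summarise("example.samples")` would then return whatever `tsinfer list` prints for that file, or `None` if the tool is not installed.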
Ancestors File
The ancestors file contains the ancestral haplotype data inferred from the sample data in the Generate ancestors step.
Todo
Document the structure of the ancestors file.
Tree sequences
The goal of tsinfer is to infer correlated genealogies from variation data, and it uses the very efficient succinct tree sequence data structure to encode this output. Please see the tskit documentation for details on how to process and manipulate such tree sequences.
The intermediate .ancestors.trees file produced by the Match ancestors step is also a tree sequence and can be loaded and analysed using the tskit API.
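Because both the final output and the intermediate .ancestors.trees file are tree sequences, the same few tskit calls can be used to inspect either. The following is a minimal sketch, assuming tskit is installed; the helper names `tree_sequence_summary` and `load_and_summarise`, and the file name `example.ancestors.trees`, are hypothetical.

```python
def tree_sequence_summary(ts):
    """Collect a few basic properties of a tskit.TreeSequence."""
    return {
        "num_samples": ts.num_samples,
        "num_trees": ts.num_trees,
        "num_sites": ts.num_sites,
        "sequence_length": ts.sequence_length,
    }


def load_and_summarise(path):
    """Load a tree sequence file with tskit and summarise it."""
    # Imported here so that tskit is only required when a file is loaded.
    import tskit

    return tree_sequence_summary(tskit.load(path))


if __name__ == "__main__":
    # "example.ancestors.trees" is a placeholder file name (an assumption).
    print(load_and_summarise("example.ancestors.trees"))
```

The summary function only reads attributes, so it works on any tree sequence regardless of which inference step produced it.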