Introduction

This is the documentation for tskit, the tree sequence toolkit. Succinct tree sequences provide a highly efficient way of storing a set of related DNA sequences by encoding their ancestral history as a set of correlated trees along the genome. The evolutionary history of genetic sequences is often technically referred to as an Ancestral Recombination Graph (ARG); succinct tree sequences are fully compatible with this formulation, and tskit is therefore a powerful platform for processing ARGs.
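To make the idea of "correlated trees along the genome" concrete, here is a minimal pure-Python sketch. It does not use tskit and is not how the library is implemented. It assumes a toy representation in which each edge records a parent, a child, and the genomic interval over which that relationship holds; the `Edge` type, the `EDGES` data, and both helper functions are hypothetical names invented for illustration.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A parent-child link that holds over the half-open interval [left, right)."""
    left: float
    right: float
    parent: int
    child: int


# Hypothetical toy data: samples 0, 1, 2 and ancestors 3, 4 on a genome of length 10.
# On [0, 5) the tree is ((0,1)3, 2)4; on [5, 10) it is (0, (1,2)3)4.
# Edges that span [0, 10) are stored once and shared by both trees.
EDGES = [
    Edge(0, 5, 3, 0),
    Edge(0, 10, 3, 1),
    Edge(5, 10, 3, 2),
    Edge(0, 5, 4, 2),
    Edge(5, 10, 4, 0),
    Edge(0, 10, 4, 3),
]


def parents_at(edges, position):
    """Return a {child: parent} map for the local tree covering `position`."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def breakpoints(edges):
    """Return the sorted genomic coordinates at which the local tree can change."""
    return sorted({e.left for e in edges} | {e.right for e in edges})


if __name__ == "__main__":
    print(breakpoints(EDGES))
    print(parents_at(EDGES, 2))
    print(parents_at(EDGES, 7))
```

In this toy example the two local trees have four edges each, yet only six edges are stored, because the links that persist across the breakpoint at position 5 are recorded once. That sharing between neighbouring trees is the intuition behind the storage efficiency described above.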

The tree sequence format is output by a number of external software libraries and programs (such as msprime, SLiM, fwdpp, and tsinfer) that either simulate or infer the evolutionary history of genetic sequences. This library provides the underlying functionality that such software uses to load, examine, and manipulate tree sequences, including efficient methods for calculating genetic statistics.

For a gentle introduction, you might like to read “What is a tree sequence?” on our tutorials site. There you can also find further tutorial material to introduce you to the key concepts behind succinct tree sequences.

Note

This documentation is under active development and may be incomplete in some areas. If you would like to help improve it, please open an issue or pull request on GitHub.