Introduction

This is the documentation for tskit, the tree sequence toolkit. Succinct tree sequences are an efficient way of representing the genetic history of a set of DNA sequences. This history is often technically referred to as an Ancestral Recombination Graph, or ARG.

The tree sequence format is output by a number of external software libraries and programs (such as msprime, SLiM, fwdpp, and tsinfer) that either simulate or infer the evolutionary ancestry of genetic sequences. This library provides the underlying functionality that such software uses to load, examine, and manipulate ARGs in tree sequence format, including efficient access to the correlated sequence of trees along a genome and general methods to calculate genetic statistics.
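To give a rough feel for what "the correlated sequence of trees along a genome" means, here is a minimal pure-Python sketch. It is a toy model, not tskit's API: the `Edge` tuple, the `local_trees` helper, and the example data are all hypothetical names chosen for illustration. The only assumption borrowed from the tree sequence format is that each edge records a parent, a child, and the genomic interval over which that relationship holds.

```python
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Edge(NamedTuple):
    """Toy edge: `child` descends from `parent` over the interval [left, right)."""
    left: float
    right: float
    parent: int
    child: int


def local_trees(
    edges: List[Edge], sequence_length: float
) -> Iterator[Tuple[Tuple[float, float], Dict[int, int]]]:
    """Yield ((left, right), {child: parent}) for each local tree along the genome."""
    # A new local tree starts wherever any edge begins or ends.
    breakpoints = sorted(
        {0.0, float(sequence_length)}
        | {e.left for e in edges}
        | {e.right for e in edges}
    )
    for left, right in zip(breakpoints, breakpoints[1:]):
        # Keep only the edges that span this whole interval.
        parent = {
            e.child: e.parent
            for e in edges
            if e.left <= left and right <= e.right
        }
        yield (left, right), parent


# Hypothetical example: samples 0, 1, 2 on a genome of length 10.
# Node 3 is the parent of samples 0 and 1 everywhere, so those two edges
# are stored once and shared by both local trees.
EDGES = [
    Edge(0.0, 10.0, 3, 0),
    Edge(0.0, 10.0, 3, 1),
    Edge(0.0, 4.0, 4, 2),
    Edge(0.0, 4.0, 4, 3),
    Edge(4.0, 10.0, 5, 2),
    Edge(4.0, 10.0, 5, 3),
]

TREES = list(local_trees(EDGES, 10.0))

for interval, parent in TREES:
    print(interval, parent)
```

Adjacent trees differ only where an edge starts or stops, which is why storing edges with intervals can be much more compact than storing every tree separately. The real library uses far more efficient algorithms than this rescan of all edges per interval.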

For a gentle introduction, you might like to read “What is a tree sequence?” on our tutorials site. There you can also find further tutorial material to introduce you to key tskit concepts.

Important

If you use tskit in your work, please remember to cite it appropriately: see the citations page for details.

Note

This documentation is under active development and may be incomplete in some areas. If you would like to help improve it, please open an issue or pull request on GitHub.