Welcome!

This site contains a number of tutorials to develop your understanding of tree sequences and software programs, such as msprime, that use them.

If you are new to the world of tree sequences, we suggest you start with the first tutorial: What is a tree sequence?

Note

Tutorials are under constant development. Those that are still a work in progress and not yet ready for use are shown in italics in the list of tutorials.

We very much welcome help developing existing tutorials or writing new ones. Please open or contribute to a GitHub issue if you would like to help out.

Other sources of help

In addition to these tutorials, our Learn page lists selected videos and publications to help you learn about tree sequences.

We aim to be a friendly, welcoming open source community. Questions and discussion about using tskit, the tree sequence toolkit, should be directed to the GitHub discussion forum, and there are similar forums for other software in the tree sequence development community, such as msprime and tsinfer.

Running tutorial code

You can run the tutorial code on your own computer, which lets you experiment with the examples provided. The recommended way to do this is from within a Jupyter notebook. As well as installing Jupyter, you will need to install the required Python libraries, most importantly tskit, msprime, numpy, and matplotlib. These and other packages are listed in the requirements.txt file, so a shortcut to installing the necessary software is:

python3 -m pip install -r https://tskit.dev/tutorials/requirements.txt
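After installation, a quick check that the main libraries can be found may save some confusion later. The following is a minimal sketch using only the Python standard library. The helper name missing_packages is hypothetical, and the list covers only the four packages named above, not everything in requirements.txt.

```python
import importlib.util

# The four packages named in the text above; requirements.txt lists more.
REQUIRED = ["tskit", "msprime", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the subset of `names` that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are installed")
```

If any package is reported as missing, re-running the pip command above in the same Python environment that Jupyter uses is a reasonable first step.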

In addition, to run the R tutorial you will need to install the R reticulate library and, if running in a Jupyter notebook, the IRkernel library. Both can be installed by running the following command within R:

install.packages(c("reticulate", "IRkernel")); IRkernel::installspec()

Downloading tutorial datafiles

Many of the tutorials use pre-existing tree sequences stored in the data directory. These can be downloaded by running the script at https://tskit.dev/tutorials/examples/download.py. If you are running the tutorial code from within a Jupyter notebook, you can load this script into a new cell using the %load cell magic. Just run the following in a Jupyter code cell:

%load https://tskit.dev/tutorials/examples/download.py

Running the resulting Python code should download the data files, then print "finished downloading" once all files have been downloaded. You should then be able to run code such as the following:

import tskit
ts = tskit.load("data/basics.trees")
print(f"The file 'data/basics.trees' exists, and contains {ts.num_trees} trees")
The file 'data/basics.trees' exists, and contains 3 trees
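If loading fails, it can help to confirm that the data file is actually present before looking for other causes. The sketch below uses only the standard library. The helper name data_file_ready is hypothetical, and the check (a non-empty regular file) is an assumption about what a successful download looks like, not a validation of the file's contents.

```python
from pathlib import Path


def data_file_ready(path):
    """Return True if `path` is an existing, non-empty regular file."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0


path = "data/basics.trees"
if data_file_ready(path):
    print(f"{path} is present")
else:
    print(f"{path} is missing or empty; try running the download script again")
```

Note that the path is relative, so the result depends on the directory from which the notebook or script is run.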