Welcome!
This site contains a number of tutorials to develop your understanding of genetic genealogies, ancestral recombination graphs, and the succinct tree sequence storage format, as implemented in tskit: the tree sequence toolkit.
Also included are a number of tutorials showing advanced use of software programs, such as msprime, that form part of the tskit ecosystem.
If you are new to the world of tree sequences, we suggest you start with the first tutorial: What is a tree sequence?
Note
Tutorials are under constant development. Those that are still a work in progress and not yet ready for use are shown in italics in the list of tutorials.
We very much welcome help developing existing tutorials or writing new ones. Please open or contribute to a GitHub issue if you would like to help out.
Other sources of help
In addition to these tutorials, our Learn page lists selected videos and publications to help you learn about tree sequences.
We aim to be a friendly, welcoming open source community. Questions and discussion about using tskit, the tree sequence toolkit, should be directed to the GitHub discussion forum. There are similar forums for other software in the tree sequence development community, such as msprime and tsinfer.
Running tutorial code
It is possible to run the tutorial code on your own computer, if you wish.
This will allow you to experiment with the examples provided.
The recommended way to do this is from within a Jupyter notebook. As well as installing Jupyter, you will also need to install the various Python libraries, most importantly tskit, msprime, numpy, and matplotlib. These and other packages are listed in the requirements.txt file; a shortcut to installing the necessary software is therefore:
python3 -m pip install -r https://tskit.dev/tutorials/requirements.txt
In addition, to run the R tutorial you will need to install the R reticulate library and, if running it in a Jupyter notebook, the IRkernel library. Both can be installed by running the following command within R:
install.packages(c("reticulate", "IRkernel")); IRkernel::installspec()
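After installing, it can be useful to confirm that the Python libraries named above are visible to the interpreter you plan to use. The following is a minimal sketch, not part of the tutorials themselves: the helper name missing_packages is hypothetical, and the list covers only the four packages mentioned in the text, not the full contents of requirements.txt.

```python
import importlib.util

# Only the packages named in the text above; requirements.txt lists more.
REQUIRED = ["tskit", "msprime", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable")
```

If any packages are reported as missing, re-running the pip command above in the same Python environment as your Jupyter kernel is a reasonable first step.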
Downloading tutorial datafiles
Many of the tutorials use pre-existing tree sequences stored in the data directory.
These can be downloaded individually from that directory, or you can download them all at once by running the script stored in https://tskit.dev/tutorials/examples/download.py.
If you are running the code in the tutorials from within a Jupyter notebook then you can simply load this code into a new cell by using the %load magic.
Just run the following in a Jupyter code cell:
%load https://tskit.dev/tutorials/examples/download.py
Running the resulting Python code should download the data files, then print out "finished downloading" when all files are downloaded. You should then be able to successfully run code such as the following:
import tskit
ts = tskit.load("data/basics.trees")
print(f"The file 'data/basics.trees' exists, and contains {ts.num_trees} trees")
The file 'data/basics.trees' exists, and contains 3 trees
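If loading fails instead, the most likely cause is that the file is not where the code expects it. As a rough sketch, the check below uses only the standard library to report whether a data file is present relative to the current working directory. The helper name check_datafile is hypothetical, and the only path assumed is the data/basics.trees file used in the example above.

```python
from pathlib import Path


def check_datafile(path):
    """Return a short status message saying whether `path` is an existing file."""
    p = Path(path)
    if p.is_file():
        return f"{p} found ({p.stat().st_size} bytes)"
    return f"{p} not found: check the working directory or re-run the download script"


print(check_datafile("data/basics.trees"))
```

Because the path is relative, the result depends on the directory from which the notebook or script is run.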