Introduction to tstrait#
Welcome to tstrait, a quantitative trait simulator based on
succinct tree sequences. Tstrait simulates quantitative
traits under various additive models, with effect sizes drawn from a specified probability distribution,
including, but not limited to, the normal, t-, and Gamma distributions. The supported
distributions are described in Effect Size Distribution. Tstrait also supports
multi-trait simulation under pleiotropy (see Multi-trait simulation for details). For ease of use,
tstrait returns its output as a pandas.DataFrame.
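To make the idea of an additive model concrete, here is a minimal plain-Python sketch. It does not use tstrait's API: the function name `simulate_genetic_values`, its parameters, and the toy genotype lists are all hypothetical and chosen for illustration only. It assumes each individual's genetic value is the sum, over causal sites, of the effect size times the allele count, with effect sizes drawn from a normal distribution.

```python
import random


def simulate_genetic_values(genotypes, num_causal, mean=0.0, sd=1.0, seed=1):
    """Toy additive model (illustrative only, not tstrait's implementation).

    genotypes: one list per individual, holding allele counts (0, 1 or 2)
    at each site.
    """
    rng = random.Random(seed)
    num_sites = len(genotypes[0])
    # Choose causal sites uniformly at random, without replacement.
    causal_sites = sorted(rng.sample(range(num_sites), num_causal))
    # Draw one effect size per causal site from a normal distribution.
    effect_sizes = {site: rng.gauss(mean, sd) for site in causal_sites}
    # Additive model: sum of (effect size * allele count) over causal sites.
    genetic_values = [
        sum(beta * row[site] for site, beta in effect_sizes.items())
        for row in genotypes
    ]
    return effect_sizes, genetic_values


# Toy data: 5 individuals, 20 sites, random allele counts.
data_rng = random.Random(42)
genotypes = [[data_rng.choice([0, 1, 2]) for _ in range(20)] for _ in range(5)]
effect_sizes, genetic_values = simulate_genetic_values(genotypes, num_causal=3)
```

Tstrait performs the corresponding computation directly on a tree sequence rather than on an explicit genotype matrix like the one assumed here.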
Advantages of tstrait#
Tstrait is built on top of tskit and takes tree sequence data as input. Tree sequences are designed to efficiently store and process millions of DNA sequences. As a result, tstrait can simulate quantitative traits substantially faster than approaches that work with a genotype matrix or other traditional data structures.
Quantitative trait simulation in tstrait is transparent, and users can control each step of the simulation. For example, users can add their own environmental noise on top of the simulated genetic values, or supply their own effect sizes and causal sites. The tree sequence data structure is widely used in population genetic simulation packages, including SLiM, msprime, and stdpopsim, so users of these packages can easily add quantitative traits to their results using tstrait.
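As an illustration of adding environmental noise on top of genetic values, the following plain-Python sketch draws normally distributed noise whose variance is set from a target narrow-sense heritability `h2`, using the relation Var(E) = Var(G) (1 - h2) / h2. This is an assumed convention for the example, not a description of tstrait's API; the function `add_environmental_noise` and the input values are hypothetical.

```python
import random
import statistics


def add_environmental_noise(genetic_values, h2, seed=1):
    """Add normal noise so that Var(G) / (Var(G) + Var(E)) is h2 in expectation.

    Illustrative sketch only; names and conventions are assumptions.
    """
    if not 0 < h2 <= 1:
        raise ValueError("h2 must satisfy 0 < h2 <= 1")
    rng = random.Random(seed)
    # Population variance of the supplied genetic values.
    var_g = statistics.pvariance(genetic_values)
    # Noise variance implied by the target heritability.
    var_e = var_g * (1 - h2) / h2
    sd_e = var_e ** 0.5
    noise = [rng.gauss(0.0, sd_e) for _ in genetic_values]
    phenotypes = [g + e for g, e in zip(genetic_values, noise)]
    return noise, phenotypes


# Hypothetical genetic values for five individuals.
genetic_values = [0.5, -1.2, 0.0, 2.3, -0.7]
noise, phenotypes = add_environmental_noise(genetic_values, h2=0.5)
```

Because the noise step takes only a list of genetic values, the same function could be applied to values obtained from any source, which mirrors the step-by-step control described above.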
Tree sequence resources#
To learn more about tree sequences:
The tskit website provides learning materials explaining what tree sequences are, and includes tutorials, publications and videos.
The PySLiM manual explains how forward-time genetic simulations can create tree sequences.
The msprime manual details an efficient backward-time genetic simulator that outputs tree sequences.
The tskit tutorials explain how to analyze succinct tree sequences by using tskit.