Python API
This page provides detailed documentation for the tszip Python API.
Usage example
Tszip can be used directly in Python to provide seamless compression and decompression of tree sequence files. Here, we run an msprime simulation and write the output to a .trees.tsz file:
import msprime
import tszip
ts = msprime.simulate(10, random_seed=1)
tszip.compress(ts, "simulation.trees.tsz")
# Later, we load the same tree sequence from the compressed file.
ts = tszip.decompress("simulation.trees.tsz")
# Or use load, which works for both compressed and uncompressed files.
ts = tszip.load("simulation.trees.tsz")
Note
For very small simulations like this example, the tszip file may be larger than the original uncompressed file.
API
- tszip.load(path)
Open a tszip-compressed file or a standard tskit file. This convenience function determines whether the file needs to be decompressed and returns the tree sequence instance in either case.
- Parameters:
path (str) – The location of the tszip compressed file or standard tskit file to load.
- Return type:
tskit.TreeSequence
- Returns:
A tskit.TreeSequence instance corresponding to the specified file.
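Callers who want to know which kind of file they have before loading it could run a similar check themselves. The sketch below uses only the standard library and assumes that tszip files are zip containers, which this page does not state. The helper name `looks_compressed` is hypothetical and not part of the tszip API, and the two files are stand-ins rather than real tree sequences.

```python
import os
import tempfile
import zipfile


def looks_compressed(path):
    """Hypothetical helper: return True if ``path`` is a zip archive.

    Assumes tszip files are zip containers; not part of the tszip API.
    """
    return zipfile.is_zipfile(path)


with tempfile.TemporaryDirectory() as tmpdir:
    zipped = os.path.join(tmpdir, "example.trees.tsz")
    plain = os.path.join(tmpdir, "example.trees")

    # Stand-in files: a small zip archive and a plain binary file.
    with zipfile.ZipFile(zipped, "w") as archive:
        archive.writestr("data", b"placeholder")
    with open(plain, "wb") as handle:
        handle.write(b"not a zip archive")

    zipped_result = looks_compressed(zipped)
    plain_result = looks_compressed(plain)

print(zipped_result, plain_result)  # True False
```

In ordinary use this check is unnecessary, since tszip.load accepts both kinds of file.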
- tszip.compress(ts, destination, variants_only=False)
Compresses the specified tree sequence and writes it to the specified path or file-like object. By default, fully lossless compression is used so that tree sequences are identical before and after compression. By specifying the variants_only option, a lossy compression can be used, which discards any information that is not needed to represent the variants (which are stored losslessly).
- Parameters:
ts (tskit.TreeSequence) – The input tree sequence.
destination (str) – The string, pathlib.Path or file-like object we should write the compressed file to.
variants_only (bool) – If True, discard all information not necessary to represent the variants in the input file.
- tszip.decompress(path)
Decompresses the tszip compressed file and returns a tskit tree sequence instance.
- Parameters:
path (str) – The location of the tszip compressed file to load.
- Return type:
tskit.TreeSequence
- Returns:
A tskit.TreeSequence instance corresponding to the specified file.