[0.3.7] - 2021-07-08
[0.3.6] - 2021-05-14
Mutation.index which were deprecated in 0.2.2 (Sep '19) have been removed.
Add direct, copy-free access to the arrays representing the quintuply-linked structure of
left_child_array). Allows performant algorithms over the tree structure using, for example, numba (@jeromekelleher, #1299, #1320).
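As a rough illustration of why array access to the linked tree structure is useful, the sketch below walks a tree in preorder using only a left-child array and a right-sibling array. It is plain Python over hand-written lists, not the tskit API: the function name, the right_sib list and the use of -1 for "no node" are assumptions made for this example.

```python
NULL = -1  # assumed marker for "no node" in this sketch


def preorder(left_child, right_sib, root):
    """Visit nodes in preorder using only the two link arrays."""
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        # Collect the children of u by following sibling links.
        kids = []
        c = left_child[u]
        while c != NULL:
            kids.append(c)
            c = right_sib[c]
        # Push in reverse so the leftmost child is visited first.
        stack.extend(reversed(kids))
    return order


# Node 4 is the root with children 0 and 3; node 3 has children 1 and 2.
left_child = [NULL, NULL, NULL, 1, 0]
right_sib = [3, 2, NULL, NULL, NULL]
print(preorder(left_child, right_sib, 4))  # [4, 0, 3, 1, 2]
```

Because the traversal touches nothing but integer arrays, loops of this shape are the kind that array-oriented compilers can typically speed up.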
SVG visualization of a tree sequence can be restricted to displaying between left and right genomic coordinates using the x_lim parameter. The default settings now mean that if the left or right flanks of a tree sequence are entirely empty, these regions will not be plotted in the SVG (@hyanwong, #1288).
[0.3.5] - 2021-03-16
SVG visualization plots mutations at the correct time, if it exists, and a y-axis with a label can be drawn. Both x- and y-axes can be plotted on trees as well as tree sequences (@hyanwong, #840, #580, #1236)
SVG visualization now uses squares for sample nodes and red crosses for mutations, with the site/mutation positions marked on the x-axis. Additionally, an x-axis label can be set (@hyanwong, #1155, #1194, #1182, #1213)
File minor version change to support individual parents
[0.3.4] - 2020-12-02
Minor bugfix release.
[0.3.3] - 2020-11-27
Add equals method to TreeSequence, TableCollection and each of the tables, which provides more flexible equality comparisons, for example, allowing users to ignore metadata or provenance in the comparison (@mufernando, @jeromekelleher, #896, #897, #913, #917).
The map_mutations method previously used the Fitch parsimony method, but this does not produce parsimonious results on non-binary trees. We now use the Hartigan parsimony algorithm, which does (@jeromekelleher, #987, #1030).
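To make the difference on non-binary trees concrete, here is a minimal sketch of the bottom-up pass of Hartigan-style parsimony in plain Python. It is an illustration only, not the library's implementation: the function name and the dictionary-based tree representation are assumptions for this example.

```python
from collections import Counter


def hartigan_parsimony(children, leaf_state, root):
    """Return (optimal root states, minimum number of state changes)."""
    total_changes = 0

    def visit(node):
        nonlocal total_changes
        kids = children.get(node, [])
        if not kids:
            return {leaf_state[node]}
        # Count, for each state, how many children list it as optimal.
        counts = Counter()
        for child in kids:
            for state in visit(child):
                counts[state] += 1
        best = max(counts.values())
        # Each child that does not share the chosen state costs one change.
        total_changes += len(kids) - best
        return {state for state, k in counts.items() if k == best}

    return visit(root), total_changes


# A root (node 4) with four leaf children carrying states A, A, B, C.
children = {4: [0, 1, 2, 3]}
leaf_state = {0: "A", 1: "A", 2: "B", 3: "C"}
states, changes = hartigan_parsimony(children, leaf_state, 4)
print(states, changes)  # {'A'} 2
```

A rule that simply takes the union of the child sets whenever their intersection is empty would charge this polytomy a single change, whereas two changes are actually required. On binary nodes the counting rule above reduces to the familiar intersection-or-union rule.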
The argument to tskit.load has been renamed from path to file.
All arguments to Tree.newick() except precision are now keyword-only.
[0.3.2] - 2020-09-29
Change several methods (
Tree()) so most parameters are keyword only, not positional. This allows reordering of parameters, so that deprecated parameters can be moved, and the parameter order in similar functions, e.g.
TreeSequence.simplify() can be made consistent (@hyanwong, #374, #846, #851)
Tree accessor functions (e.g.
ts.at() pass extra parameters such as
sample_indexes to the underlying
root_threshold can be specified when calling
ts.trees() (@hyanwong, #847, #848)
[0.3.1] - 2020-09-04
[0.3.0] - 2020-08-27
Major feature release for metadata schemas, set-like operations, mutation times, SVG drawing improvements and many others.
The default display order for tree visualisations has been changed to minlex (see below) to stabilise the node ordering and to make trees more readily comparable. The old behaviour is still available with
File system operations such as dump/load now raise an appropriate OSError instead of
tskit.FileFormatError. Loading from an empty file now raises and
Bad tree topologies are detected earlier, so that it is no longer possible to create a TreeSequence object which contains a parent with contradictory children on an interval. Previously an error was thrown when some operation building the trees was attempted (@jeromekelleher, #709).
The TableCollection object no longer implements the iterator protocol. Previously list(tables) returned a sequence of (table_name, table_instance) tuples. This has been replaced with the more intuitive and future-proof
TreeSequence.tables_dict attributes, which perform the same function (@jeromekelleher, #500, #694).
The arguments to
TreeSequence.variants must now be keyword arguments, not positional. This is to support the change from impute_missing_data to isolated_as_missing in the arguments to these methods. (@benjeffery, #716, #794)
New methods to perform set operations on TableCollections and TreeSequences.
TableCollection.subset subsets and reorders table collections by nodes (@mufernando, @petrelharp, #663, #690).
TableCollection.union forms the node-wise union of two table collections (@mufernando, @petrelharp, #381, #623).
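The idea of subsetting and reordering by nodes can be sketched on toy data as follows. This is a simplified, hypothetical model (a list of node times and a list of edge tuples), not the TableCollection API, and it assumes that only edges whose parent and child are both retained are kept.

```python
def subset_tables(node_times, edges, nodes):
    """Keep only `nodes`, in the given order, and remap IDs in the edges.

    node_times: list of times indexed by node ID (toy node table).
    edges: list of (left, right, parent, child) tuples (toy edge table).
    nodes: list of node IDs to keep; the new ID of nodes[j] is j.
    """
    new_id = {old: new for new, old in enumerate(nodes)}
    sub_times = [node_times[u] for u in nodes]
    sub_edges = [
        (left, right, new_id[parent], new_id[child])
        for (left, right, parent, child) in edges
        if parent in new_id and child in new_id
    ]
    return sub_times, sub_edges


node_times = [0.0, 0.0, 1.0, 2.0]
edges = [(0, 10, 2, 0), (0, 10, 2, 1), (0, 10, 3, 2)]
# Keep node 2 and node 0, in that order: node 2 becomes 0, node 0 becomes 1.
print(subset_tables(node_times, edges, [2, 0]))
# ([1.0, 0.0], [(0, 10, 0, 1)])
```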
Mutations now have an optional double-precision floating-point time column. If not specified, this defaults to a particular NaN value (tskit.UNKNOWN_TIME) indicating that the time is unknown. For a tree sequence to be considered valid it must meet new criteria for mutation times, see Mutation requirements. Also added function TableCollection.compute_mutation_times. Table sorting orders mutations by non-increasing time per-site, which is also a requirement for a valid tree sequence (@benjeffery, #672).
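Using one specific NaN as a sentinel has a practical consequence: NaN never compares equal to anything, so the sentinel has to be recognised by its bit pattern rather than with ==. The sketch below shows the idea with the standard library only. The bit pattern and helper name are assumptions chosen for illustration and are not taken from the library.

```python
import math
import struct

# Hypothetical sentinel: a quiet NaN carrying a non-zero payload.
UNKNOWN_TIME_BITS = 0x7FF8000000000123
UNKNOWN_TIME = struct.unpack("<d", struct.pack("<Q", UNKNOWN_TIME_BITS))[0]


def is_unknown_time(value):
    """True only if `value` has exactly the sentinel's bit pattern."""
    return struct.pack("<d", value) == struct.pack("<Q", UNKNOWN_TIME_BITS)


print(math.isnan(UNKNOWN_TIME))       # the sentinel is a NaN
print(UNKNOWN_TIME == UNKNOWN_TIME)   # equality cannot detect it
print(is_unknown_time(UNKNOWN_TIME))  # the bit pattern can
print(is_unknown_time(math.nan))      # an ordinary NaN is a different value
```

Distinguishing the sentinel from other NaNs lets a genuinely invalid time (any other NaN) still be reported as an error.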
Tables with a metadata column now have a metadata_schema that is used to validate and encode metadata that is passed to add_row and decode metadata on calls to
tree_sequence.node(j). See Metadata (@benjeffery, #491, #542, #543, #601).
Add classes to SVG drawings to allow easy adjustment and styling, and document the new
tskit.TreeSequence.draw_svg() methods. This also fixes #467 for duplicate SVG entity ids in Jupyter notebooks (@hyanwong, #555).
Add an optional node traversal order in tskit.Tree that uses the minimum lexicographic order of leaf nodes visited. This ordering ("minlex_postorder") adds more determinism because it constrains the order in which children of a node are visited (@brianzhang01, #411).
Add an order argument to the tree visualisation functions which supports two node orderings: "tree" (the previous default) and "minlex", which stabilises the node ordering (making it easier to compare trees). The default node ordering is changed to "minlex" (@brianzhang01, @jeromekelleher, #389, #566).
Allow sites with missing data to be output by the haplotypes method, by default replacing with -. Errors are no longer raised for missing data with isolated_as_missing=True; the error types raised for bad alleles (e.g. multiletter or non-ascii) have also changed from _tskit.LibraryError to TypeError, or ValueError if the missing data character clashes (@hyanwong, #426).
Add keep_input_roots option to simplify which, if enabled, adds edges from the MRCAs of samples in the simplified tree sequence back to the roots in the input tree sequence (@jeromekelleher, #775, #782).
The sample_counts feature has been deprecated and is now ignored. Sample counts are now always computed.
The impute_missing_data argument is deprecated and replaced with isolated_as_missing. Note that to get the same behaviour impute_missing_data=True should be replaced with isolated_as_missing=False. (@benjeffery, #716, #794)
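Because the new argument has the opposite sense to the old one, a compatibility shim has to invert the value rather than pass it through. The helper below is a hypothetical sketch of that translation. Its name, the warning category, and the assumption that isolated_as_missing defaults to True are illustrative choices and are not taken from the library.

```python
import warnings


def resolve_isolated_as_missing(impute_missing_data=None, isolated_as_missing=None):
    """Translate the deprecated flag into the new one (illustrative only)."""
    if impute_missing_data is not None and isolated_as_missing is not None:
        raise ValueError("Specify only one of the two arguments")
    if impute_missing_data is not None:
        warnings.warn(
            "impute_missing_data is deprecated; use isolated_as_missing",
            FutureWarning,
            stacklevel=2,
        )
        # The two flags have opposite meanings, so invert the value.
        return not impute_missing_data
    # Assumed default for this sketch.
    return True if isolated_as_missing is None else isolated_as_missing


print(resolve_isolated_as_missing(isolated_as_missing=False))  # False
```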
[0.2.3] - 2019-11-22
Minor feature release, providing a tree distance metric and various methods to manipulate tree sequence data.
[0.2.2] - 2019-09-01
Minor bugfix release.
Relaxes overly-strict input requirements on individual location data that caused some SLiM tree sequences to fail loading in version 0.2.1 (see #351).
[0.2.1] - 2019-08-23
Major feature release, adding support for population genetic statistics, improved VCF output and many other features.
Note: Version 0.2.0 was skipped because of an error uploading to PyPI which could not be undone.
Genotype arrays returned by TreeSequence.genotype_matrix have changed from unsigned 8 bit values to signed 8 bit values to accommodate missing data (see #144 for discussion). Specifically, the dtype of the genotypes arrays has changed from numpy uint8 to int8. This should not affect client code in any way unless it specifically depends on the type of the returned numpy array.
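The sketch below shows why the signedness matters to code that depends on the array type. It uses only the standard library and assumes, for illustration, that missing data is encoded as -1: the same byte reads as -1 when treated as signed and as 255 when treated as unsigned.

```python
import struct

MISSING = -1  # assumed encoding of missing data for this sketch
genotypes = [0, 1, MISSING, 1]

# Store the genotypes as signed 8 bit values.
raw = struct.pack("4b", *genotypes)

# Reading the same bytes back with each interpretation.
as_signed = list(struct.unpack("4b", raw))
as_unsigned = list(struct.unpack("4B", raw))

print(as_signed)    # [0, 1, -1, 1]
print(as_unsigned)  # [0, 1, 255, 1]
```

Code that reinterprets the buffer as unsigned, or that uses genotype values directly as indexes into an allele list, is therefore the kind of client code that needs checking.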
The VCF written by the write_vcf method is no longer compatible with previous versions, which had significant shortcomings. Position values are now rounded to the nearest integer by default, and REF and ALT values are derived from the actual allelic states (rather than always being A and T). Sample names are now of the form tsk_j for sample ID j. Most of the legacy behaviour can be recovered with new options, however.
The positional parameter
mean_descendants TreeSequence methods has been renamed to
Support for general windowed statistics. Implementations of diversity, divergence, segregating sites, Tajima’s D, Fst, Patterson’s F statistics, Y statistics, trait correlations and covariance, and k-dimensional allele frequency spectra (@petrelharp, @jeromekelleher, @molpopgen).
Add map_ancestors method to TableCollection (@gtsambos). See #175.
Add support for individuals to VCF output, and fix major issues with output format (@jeromekelleher). Position values are transformed in a much more straightforward manner and output has been generalised substantially. Adds
position_transform arguments. See #286, and issues #2, #30 and #73.
Control height scale in SVG trees using ‘tree_height_scale’ and ‘max_tree_height’ (@hyanwong, @jeromekelleher). See #167, #168. Various other improvements to tree drawing (#235, #241, #242, #252, #259).
Add keep_intervals method for the TableCollection to allow slicing out of topology from specific intervals (@hyanwong, @andrewkern, @petrelharp, @jeromekelleher). See #225 and #261.
[0.1.5] - 2019-03-27
This release removes support for Python 2, adds more flexible tree access and a tskit command line interface.
More flexible tree API (#121). Adds
TreeSequence.at_index methods to find specific trees, and efficient support for backwards traversal using
tskit info CLI command (#66)
[0.1.4] - 2019-02-01
Minor feature update. Using the C API 0.99.1.
[0.1.3] - 2019-01-14
Fix missing provenance schema: https://github.com/tskit-dev/tskit/issues/81
[0.1.2] - 2019-01-14
Fix memory leak in table collection. https://github.com/tskit-dev/tskit/issues/76
[0.1.1] - 2019-01-11
Fixes broken distribution tarball for 0.1.0.
[0.1.0] - 2019-01-11
Initial release after separation from msprime 0.6.2. Code that reads tree sequence files and processes them should be able to work without changes.
Removal of the previously deprecated
load_tables functions. All code should change to using corresponding TableCollection methods.
[1.1.0a1] - 2019-01-10
Initial alpha version posted to PyPI for bootstrapping.
[0.0.0] - 2019-01-10
Initial extraction of tskit code from msprime. Relicense to MIT.
Code copied at hash 29921408661d5fe0b1a82b1ca302a8b87510fd23
[0.99.13] - 2021-07-08
[0.99.12] - 2021-05-14
[0.99.11] - 2021-03-16
tsk_individual_table_add_row has extra arguments
Mutation error codes have changed
[0.99.10] - 2021-01-25
Minor bugfix on internal APIs
[0.99.9] - 2021-01-22
[0.99.8] - 2020-11-27
Change tsk_table_collection_equals and table equality methods to allow for more flexible equality criteria (e.g., ignore top-level metadata and schema or provenance tables). Existing code should add an extra final parameter 0 to retain the current behaviour (@mufernando, @jeromekelleher, #896, #897, #913, #917).
[0.99.7] - 2020-09-29
[0.99.6] - 2020-09-04
[0.99.5] - 2020-08-27
Add TSK_KEEP_INPUT_ROOTS option to simplify which, if enabled, adds edges from the MRCAs of samples in the simplified tree sequence back to the roots in the input tree sequence (@jeromekelleher, #775, #782).
[0.99.4] - 2020-08-12
The TSK_VERSION_PATCH macro was incorrectly set to 4 for 0.99.3, so both 0.99.4 and 0.99.3 have the same value.
[0.99.3] - 2020-07-27
Change genotypes from unsigned to signed to accommodate missing data (see #144 for discussion). This only affects users of the tsk_vargen_t class. Genotypes are now stored as int8_t and int16_t types rather than the former unsigned types. The field names in the genotypes union of the tsk_variant_t struct returned by tsk_vargen_next have been renamed to
i16 accordingly; care should be taken when updating client code to ensure that types are correct. The number of distinct alleles supported by 8 bit genotypes has therefore dropped from 255 to 127, with a similar reduction for 16 bit genotypes.
Change the tsk_vargen_init method to take an extra parameter alleles. To keep the current behaviour, set this parameter to NULL.
Edges can now have metadata. Hence edge methods now take two extra arguments: metadata and metadata length. The file format has also changed to accommodate this, but is backwards compatible. Edge metadata can be disabled for a table collection with the TSK_NO_EDGE_METADATA flag. (@benjeffery, #496, #712)
Migrations can now have metadata. Hence migration methods now take two extra arguments: metadata and metadata length. The file format has also changed to accommodate this, but is backwards compatible. (@benjeffery, #505)
Bad tree topologies are detected earlier, so that it is no longer possible to create a tsk_treeseq_t object which contains a parent with contradictory children on an interval. Previously an error occurred when some operation building the trees was attempted (@jeromekelleher, #709).
New methods to perform set operations on table collections.
tsk_table_collection_subset subsets and reorders table collections by nodes (@mufernando, @petrelharp, #663, #690).
tsk_table_collection_union forms the node-wise union of two table collections (@mufernando, @petrelharp, #381, #623).
Mutations now have an optional double-precision floating-point time column. If not specified, this defaults to a particular NaN value (TSK_UNKNOWN_TIME) indicating that the time is unknown. For a tree sequence to be considered valid it must meet new criteria for mutation times, see Mutation requirements. Add tsk_table_collection_compute_mutation_times and new flag to
TSK_CHECK_MUTATION_TIME. Table sorting orders mutations by non-increasing time per-site, which is also a requirement for a valid tree sequence. (@benjeffery, #672)
metadata_schema fields to table collection, with accessors on tree sequence. These store arbitrary bytes and are optional in the file format. (@benjeffery, #641)
Add set_root_threshold option to tsk_tree_t which allows us to set the number of samples a node must be an ancestor of to be considered a root (#462).
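The effect of a root threshold can be sketched in a few lines of Python. This is a toy model over a parent array, not the C API: the function name, the -1 null marker, and the rule that a root is a parentless node with at least the threshold number of samples at or beneath it are assumptions made for illustration.

```python
NULL = -1  # assumed marker for "no parent" in this sketch


def find_roots(parent, samples, root_threshold=1):
    """Return parentless nodes with at least `root_threshold` samples below."""
    num_samples = [0] * len(parent)
    # Each sample contributes one count to itself and every ancestor.
    for s in samples:
        u = s
        while u != NULL:
            num_samples[u] += 1
            u = parent[u]
    return [
        u
        for u, p in enumerate(parent)
        if p == NULL and num_samples[u] >= root_threshold
    ]


# Nodes 0 and 1 join at node 4; node 2 is an isolated sample;
# node 3 is an isolated non-sample node.
parent = [4, 4, NULL, NULL, NULL]
samples = [0, 1, 2]
print(find_roots(parent, samples, root_threshold=1))  # [2, 4]
print(find_roots(parent, samples, root_threshold=2))  # [4]
```

Raising the threshold to 2 in this toy example stops the isolated sample from being reported as a root.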
Change the semantics of tsk_tree_t so that sample counts are always computed, and add a new TSK_NO_SAMPLE_COUNTS option to turn this off (#462).
The TSK_SAMPLE_COUNTS option is now ignored and will print out a warning if used (#462).
[0.99.2] - 2019-03-27
Bugfix release. Changes:
Fix incorrect errors on tbl_collection_dump (#132)
Catch table overflows (#157)
[0.99.1] - 2019-01-24
Refinements to the C API as we move towards 1.0.0. Changes:
_table_ to improve readability. Hence, we now have, e.g.,
Standardise public API to use
tsk_flags_t typedef and consistently use this as the type used to encode bitwise flags. To avoid confusion, functions now have an
Change tsk_table_collection_sort to take a bookmark as start argument.
Relax restriction that nodes in the samples argument to simplify must currently be marked as samples. (https://github.com/tskit-dev/tskit/issues/72)
Allow tsk_table_collection_simplify to take a NULL samples argument to specify “all samples in the current tables”.
Add support for building as a meson subproject.
[0.99.0] - 2019-01-14
Initial alpha version of the tskit C API tagged. Version 0.99.x represents the series of releases leading to version 1.0.0 which will be the first stable release. After 1.0.0, semver rules regarding API/ABI breakage will apply; however, in the 0.99.x series arbitrary changes may happen.
[0.0.0] - 2019-01-10
Initial extraction of tskit code from msprime. Relicense to MIT. Code copied at hash 29921408661d5fe0b1a82b1ca302a8b87510fd23