{
"cells": [
{
"cell_type": "markdown",
"id": "20528963",
"metadata": {},
"source": [
"(sec_legacy_0x)=\n",
"# Legacy (version 0.x) APIs\n",
"\n",
"Msprime 1.0 involved major changes to the internal details of\n",
"how simulations are performed and the addition of a new set of\n",
"APIs, which solve some long-standing issues and provide\n",
"more sensible defaults.\n",
"\n",
"**Short version:**\n",
"\n",
"- Your old simulations will still work and the msprime 0.x APIs will\n",
"be supported **indefinitely**. We're not going to break your code.\n",
"The only situation in which your simulation might break is if you've\n",
"been doing some obscure things with genome discretisation. Please\n",
"see {ref}`this section` for\n",
"more information.\n",
"\n",
"- The new APIs are **much better**, and support several features\n",
"that are not available for the legacy API. In general, new features\n",
"will only be added to the 1.x APIs and 0.x legacy APIs are\n",
"in **maintenance mode**.\n",
"\n",
"## Upgrading code\n",
"\n",
"This section is to help legacy 0.x users of msprime get up to speed quickly, summarising\n",
"the new APIs and their main differences to the 0.x versions.\n",
"\n",
"The main change is that there are two new functions, {func}`.sim_ancestry` and\n",
"{func}`.sim_mutations` which correspond to the 0.x functions {func}`.simulate`\n",
"and {func}`.mutate`. The 0.x functions are **deprecated** but **will continue\n",
"to be supported indefinitely**.\n",
"\n",
"One major change is that {ref}`ancestry` and\n",
"{ref}`mutations` must be simulated **separately**.\n",
"See the {ref}`sec_randomness_replication_mutations` section for\n",
"idiomatic examples of how to do this efficiently.\n",
"\n",
"### Ancestry\n",
"\n",
"The new {func}`.sim_ancestry` function replaces the 0.x {func}`.simulate`\n",
"function and is very similar. See the {ref}`sec_ancestry` page for\n",
"details and extensive examples using this function.\n",
"\n",
"There are some important differences between {func}`.simulate`\n",
"and {func}`.sim_ancestry`:\n",
"\n",
"* The `samples` parameter now refers to the **number of individuals**\n",
" rather than **the number of nodes** (i.e. monoploid genomes).\n",
" Because the default {ref}`ploidy `\n",
" is 2 (see the next point) the upshot is that `sim_ancestry(2)` will\n",
" result in a tree sequence with *four* sample nodes, not two. (It is\n",
" possible to override this behaviour using the list of {class}`.SampleSet`\n",
" objects parameter to `samples`.)\n",
"* The `Ne` parameter in 0.x {func}`.simulate` function has been replaced\n",
" with the `population_size` parameter.\n",
"* There is now a {ref}`sec_ancestry_ploidy` parameter, which has\n",
" two effects:\n",
"\n",
" 1. Sets the default number of sample nodes per *individual*\n",
" 2. Changes the {ref}`timescale`\n",
" over which coalescence occurs. By default `ploidy` is 2 and\n",
" so mean time to common ancestor in a population of size `N` is `2N` generations,\n",
" which is the same as msprime 0.x.\n",
"* Rather than two parameters `num_samples` and `samples`, the\n",
" {func}`.sim_ancestry` function has a single parameter `samples` which\n",
" has different behaviour depending on the type of parameters provided.\n",
" See {ref}`sec_ancestry_samples` for details.\n",
" Note in particular that a list of `Sample` objects is **not** supported.\n",
"* Similarly, there is now one parameter `recombination_rate` which can\n",
" be either a single value or a {class}`.RateMap` object. Note that the\n",
" 0.x {class}`.RecombinationMap` is deprecated and not supported as input\n",
" to {func}`.sim_ancestry`. See {ref}`sec_ancestry_recombination` for more\n",
" details.\n",
"* Simulations are performed on a **discrete** genome by default. To get the\n",
" 0.x behaviour of a continuous genome, set `discrete_genome=False`.\n",
" See the {ref}`sec_ancestry_discrete_genome` section for more details.\n",
"* The `from_ts` parameter used has been renamed to `initial_state` and\n",
" accepts either a {class}`tskit.TableCollection` or {class}`tskit.TreeSequence`\n",
" parameter. See the {ref}`sec_ancestry_initial_state` section for details.\n",
"* There is **no** `mutation_rate` parameter to {func}`.sim_ancestry`: use\n",
" {func}`.sim_mutations` instead.\n",
"* The `population_configurations`, `migration_matrix` and `demographic_events`\n",
" parameters have been replace with a single parameter `demography`, which must take\n",
" a {class}`.Demography` instance. (See the next section for more details.)\n",
"\n",
"### Demography\n",
"\n",
"A new {class}`.Demography` object has been added for version 1.0 which\n",
"encapsulates the functionality needed to define and debug demographic models\n",
"in msprime. Demographic models can only be specified to `sim_ancestry`\n",
"using an instance of this class.\n",
"\n",
"See the {ref}`sec_demography` section for detailed documentation on this\n",
"new interface.\n",
"\n",
"* It is easy to create a {class}`.Demography` from the 0.x\n",
"`population_configurations`, `migration_matrix` and `demographic_events`\n",
"values using the {meth}`.Demography.from_old_style` method.\n",
"\n",
"* The {class}`.DemographyDebugger` class should no longer be instantiated\n",
"directly; instead use the {meth}`.Demography.debug` method.\n",
"\n",
"(sec_legacy_0x_genome_discretisation)=\n",
"### Genome discretisation\n",
"\n",
"In msprime 0.x, recombination was implemented internally using a discrete\n",
"number of genetic loci. That is, the simulation was performed in\n",
"*genetic* coordinates, which were then mapped back to *physical* coordinates\n",
"at the end of simulation. This had the significant advantage that\n",
"recombination could be implemented during the simulation as a uniform process\n",
"over these discrete loci. However, it led to\n",
"many numerical issues encountered when mapping back and\n",
"forth between genetic and physical coordinates as well as limiting\n",
"what could be done in terms of gene conversion and other processes.\n",
"We therefore changed to using physical coordinates throughout the simulation\n",
"for msprime 1.0.\n",
"\n",
"The simulations in 0.x and 1.x are almost entirely compatible and everything\n",
"should work as expected when running 0.x code on msprime 1.0 or later. However,\n",
"there is one (hopefully obscure) case in which code written for msprime 0.x\n",
"will no longer work.\n",
"\n",
"The ``num_loci`` argument to the 0.x class {class}`.RecombinationMap`\n",
"was used to control the number of discrete genetic loci in the simulation. By\n",
"default, this was set to a large number ({math}`\\sim 2^{32}`), effectively\n",
"giving a continuous coordinate space when mapped back into physical units.\n",
"By setting the ``num_loci`` equal\n",
"to the sequence length of the RecombinationMap, we could also specify\n",
"discrete physical loci. Specifying whether we simulate in discrete or continuous\n",
"genome coordinates is now done using the ``discrete_genome`` argument\n",
"to {func}`.sim_ancestry` (see the {ref}`sec_ancestry_discrete_genome`\n",
"section for more details). The {class}`.RateMap` class is now used to\n",
"specify varying rates of recombination along the genome and no longer\n",
"has any concept of genetic \"loci\" --- the choice of coordinate space\n",
"is now decoupled from our specification of the recombination process.\n",
"\n",
"Both the cases of discrete and fully continuous genomes are well\n",
"supported in msprime 1.x and so nearly all existing code\n",
"should continue to work as expected.\n",
"What is no longer supported is specifying the \"granularity\" of the\n",
"continuous space via the ``num_loci`` parameter, and if we try\n",
"to set ``num_loci`` to anything other than the sequence length\n",
"we get an error:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a09980ee",
"metadata": {
"tags": [
"raises-exception"
]
},
"outputs": [
{
"ename": "ValueError",
"evalue": "The RecombinationMap interface is deprecated and only partially supported. If you wish to simulate a number of discrete loci, you must set num_loci == the sequence length. If you wish to simulate recombination process on as fine a map as possible, please omit the num_loci parameter (or set to None). Otherwise, num_loci is no longer supported and the behaviour of msprime 0.x cannot be emulated. Please consider upgrading your code to the version 1.x APIs.",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"Input \u001b[0;32mIn [1]\u001b[0m, in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mmsprime\u001b[39;00m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;66;03m# Here we try to make a sequence length of 10 with 5 discrete loci\u001b[39;00m\n\u001b[0;32m----> 4\u001b[0m recomb_map \u001b[38;5;241m=\u001b[39m \u001b[43mmsprime\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mRecombinationMap\u001b[49m\u001b[43m(\u001b[49m\u001b[43mpositions\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m10\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mrates\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0.1\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnum_loci\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/work/tskit-site/tskit-site/msprime/intervals.py:649\u001b[0m, in \u001b[0;36mRecombinationMap.__init__\u001b[0;34m(self, positions, rates, num_loci, map_start)\u001b[0m\n\u001b[1;32m 647\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_is_discrete \u001b[38;5;241m=\u001b[39m num_loci \u001b[38;5;241m==\u001b[39m positions[\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m]\n\u001b[1;32m 648\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m num_loci \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m num_loci \u001b[38;5;241m!=\u001b[39m positions[\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m]:\n\u001b[0;32m--> 649\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m 650\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mThe RecombinationMap interface is deprecated and only \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 651\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpartially supported. If you wish to simulate a number of \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 652\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdiscrete loci, you must set num_loci == the sequence length. \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 653\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mIf you wish to simulate recombination process on as fine \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 654\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124ma map as possible, please omit the num_loci parameter (or set \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 655\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mto None). Otherwise, num_loci is no longer supported and \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 656\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mthe behaviour of msprime 0.x cannot be emulated. 
Please \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 657\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mconsider upgrading your code to the version 1.x APIs.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 658\u001b[0m )\n\u001b[1;32m 659\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mmap \u001b[38;5;241m=\u001b[39m RateMap(position\u001b[38;5;241m=\u001b[39mpositions, rate\u001b[38;5;241m=\u001b[39mrates[:\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m])\n",
"\u001b[0;31mValueError\u001b[0m: The RecombinationMap interface is deprecated and only partially supported. If you wish to simulate a number of discrete loci, you must set num_loci == the sequence length. If you wish to simulate recombination process on as fine a map as possible, please omit the num_loci parameter (or set to None). Otherwise, num_loci is no longer supported and the behaviour of msprime 0.x cannot be emulated. Please consider upgrading your code to the version 1.x APIs."
]
}
],
"source": [
"import msprime\n",
"\n",
"# Here we try to make a sequence length of 10 with 5 discrete loci\n",
"recomb_map = msprime.RecombinationMap(positions=[0, 10], rates=[0.1, 0], num_loci=5)"
]
},
{
"cell_type": "markdown",
"id": "a3f52127",
"metadata": {},
"source": [
"If you get this error, please check whether specifying a\n",
"number of loci like this was actually what you intended. Almost\n",
"certainly you actually wanted to simulate a continuous genome\n",
"(omit the ``num_loci`` parameter) or a discrete genome\n",
"with the breaks occurring integer boundaries (set ``num_loci``\n",
"equal to the sequence length).\n",
"\n",
"If not, please let us know your use case and we may be able\n",
"to accommodate it in the new code. Until then, you will need\n",
"to downgrade msprime to 0.7.x for your simulations to run.\n",
"\n",
"### Mutations\n",
"\n",
"Msprime 1.0 provides powerful new methods for simulating mutational\n",
"processes, adding support for finite-sites mutations and a\n",
"range of different {ref}`mutation models`.\n",
"Similarly to the approach for ancestry simulations, we introduce\n",
"a new function {func}`.sim_mutations` which allows us to provide\n",
"new, more appropriate defaults while still supporting older code.\n",
"\n",
"Differences between the 1.x {func}`.sim_mutations` and 0.x {func}`.mutate`\n",
"functions:\n",
"\n",
"* The {func}`.sim_mutations` function works on a **discrete** genome by default.\n",
"\n",
"* There are now also many new mutation models supported by {func}`.sim_mutations`;\n",
" see {ref}`sec_mutations` for details. These are *not* supported in the deprecated\n",
" {func}`.mutate` function.\n",
"\n",
"* The simulated mutations now have a simulated ``time`` value, which specifies the\n",
" precise time that the mutation occurred. Note that this value is also provided in the\n",
" returned tables for the deprecated ``simulate()`` and ``mutate()`` functions,\n",
" which may lead to some compatibility issues. (We considered removing the simulated\n",
" mutation times for these 0.x functions for strict compatibility, but this would\n",
" have broken any code using the ``keep`` option in mutate.)\n",
"\n",
"\n",
"## API Reference\n",
"\n",
"\n",
"### Ancestry\n",
"\n",
"```{eval-rst}\n",
".. autofunction:: msprime.simulate()\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.PopulationConfiguration\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.PopulationParametersChange\n",
"\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.MigrationRateChange\n",
"\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.MassMigration\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.CensusEvent\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.Sample\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.SimulationModelChange\n",
"```\n",
"\n",
"### Recombination maps\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.RecombinationMap\n",
" :members:\n",
"```\n",
"\n",
"### Mutations\n",
"\n",
"```{eval-rst}\n",
".. autofunction:: msprime.mutate\n",
"```\n",
"\n",
"```{eval-rst}\n",
".. autoclass:: msprime.InfiniteSites\n",
"```"
]
}
],
"metadata": {
"jupytext": {
"text_representation": {
"extension": ".md",
"format_name": "myst",
"format_version": 0.12,
"jupytext_version": "1.9.1"
}
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"source_map": [
12,
159,
165
]
},
"nbformat": 4,
"nbformat_minor": 5
}