diff --git a/content/tskit.ipynb b/content/tskit.ipynb
index c01b2fc..3d2127b 100644
--- a/content/tskit.ipynb
+++ b/content/tskit.ipynb
@@ -20,7 +20,7 @@
"id": "0969a7e2-ac66-4d98-bf16-af2a5389bd06",
"metadata": {},
"source": [
- "## Loading an ARG in tree sequence format"
+ "## Loading an ARG in \"succinct tree sequence\" format"
]
},
{
@@ -40,7 +40,7 @@
"metadata": {},
"outputs": [],
"source": [
- "ts = tskit.load(\"data/demo.trees\")"
+ "ts = tskit.load(\"data/demo.trees\") # By convention we use `ts` or `arg` for the object name"
]
},
{
@@ -68,7 +68,9 @@
"source": [
"
Note: the
provenances listed above (in the last part of the output) show how this tree sequence was originally generated. In this case, close inspection shows it was initially simulated by the command
sim_ancestry(), provided by
msprime version 1.4.0, then simplified, with mutations finally added using the
msprime sim_mutations() function.
\n",
"\n",
- "Each [node](https://tskit.dev/tskit/docs/stable/data-model.html#node-table) in the tree sequence represents a (haploid) genome. *Sample* nodes are the (usually current-day) genomes whose DNA sequences are known. In this tree sequence there are 80 sample nodes, belonging to 40 (diploid) individuals.\n",
+ "Each [node](https://tskit.dev/tskit/docs/stable/data-model.html#node-table) in the tree sequence represents a (haploid) genome. *Sample* nodes are the (usually current-day) genomes whose DNA sequences are known. Here there are 80 sample nodes, belonging to 40 (diploid) individuals.\n",
+ "\n",
+ "As its name suggests, a tree sequence can be interpreted as a sequence of evolutionary (\"gene\") trees along a genome. This is done by linking the sample nodes to ancestral nodes via *edges*. The table above reveals that there are 6979 edges in this tree sequence, defining 1811 trees. Later in this workbook you'll see how to plot one of the local trees.\n",
"\n",
"## Mutations and genetic variation\n",
"\n",