1 change: 1 addition & 0 deletions .gitignore
@@ -11,3 +11,4 @@ wheels/
.hypothesis
src/contingency/__coconut_cache__
site
examples/.ipynb_checkpoints
27 changes: 24 additions & 3 deletions .gitlab-ci.yml
@@ -1,5 +1,5 @@
stages:
- pages
- install_and_deploy

variables:
UV_VERSION: "0.9.28"
@@ -9,13 +9,33 @@ variables:
# so we need to copy instead of using hard links.
UV_LINK_MODE: copy

zensical:
uv-setup:
stage: install_and_deploy
image: ghcr.io/astral-sh/uv:$UV_VERSION-python$PYTHON_VERSION-$BASE_LAYER
stage: pages
variables:
UV_CACHE_DIR: .uv-cache
cache:
- key:
files:
- uv.lock
paths:
- $UV_CACHE_DIR

before_script:
- apk add g++ build-base linux-headers
script:
- uv sync
- uv cache prune --ci
# pytest:
# stage: install_and_deploy
# needs: ["uv-setup"]
# script:
- uv run pytest "tests/test_contingency.py"

# zensical:
# stage: install_and_deploy
# needs: ["uv-setup"]
# script:
- uv run zensical build
# - mv site public
artifacts:
@@ -26,3 +46,4 @@ zensical:
publish: site
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
- if: $CI_PIPELINE_SOURCE == "merge_request_event" && $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == $CI_DEFAULT_BRANCH
6 changes: 4 additions & 2 deletions docs/api/contingent.md
@@ -1,7 +1,9 @@
---
title: Contingent
---



::: contingency.Contingent
::: contingency.contingent
handler: python
options:
show_root_heading: true
4 changes: 4 additions & 0 deletions docs/api/plotting.md
@@ -3,3 +3,7 @@ title: Plotting Utilities
---

::: contingency.plots
handler: python
options:
show_root_heading: true

49 changes: 49 additions & 0 deletions docs/css/mkdocstrings.css
@@ -3,4 +3,53 @@ div.doc-contents:not(.first) {
padding-left: 25px;
border-left: 4px solid rgb(230, 230, 230);
margin-bottom: 80px;
}


/* Tree-like output for backlinks. */
.doc-backlink-list {
--tree-clr: var(--md-default-fg-color);
--tree-font-size: 1rem;
--tree-item-height: 1;
--tree-offset: 1rem;
--tree-thickness: 1px;
--tree-style: solid;
display: grid;
list-style: none !important;
}

.doc-backlink-list li>span:first-child {
text-indent: .3rem;
}

.doc-backlink-list li {
padding-inline-start: var(--tree-offset);
border-left: var(--tree-thickness) var(--tree-style) var(--tree-clr);
position: relative;
margin-left: 0 !important;

&:last-child {
border-color: transparent;
}

&::before {
content: '';
position: absolute;
top: calc(var(--tree-item-height) / 2 * -1 * var(--tree-font-size) + var(--tree-thickness));
left: calc(var(--tree-thickness) * -1);
width: calc(var(--tree-offset) + var(--tree-thickness) * 2);
height: calc(var(--tree-item-height) * var(--tree-font-size));
border-left: var(--tree-thickness) var(--tree-style) var(--tree-clr);
border-bottom: var(--tree-thickness) var(--tree-style) var(--tree-clr);
}

&::after {
content: '';
position: absolute;
border-radius: 50%;
background-color: var(--tree-clr);
top: calc(var(--tree-item-height) / 2 * 1rem);
left: var(--tree-offset);
translate: calc(var(--tree-thickness) * -1) calc(var(--tree-thickness) * -1);
}
}
121 changes: 57 additions & 64 deletions docs/getting-started/02-tutorial.md

Large diffs are not rendered by default.

19 changes: 15 additions & 4 deletions docs/getting-started/03-performance.md
@@ -6,22 +6,23 @@ icon: lucide/trending-up
When datasets become increasingly large, the number of unique thresholds can grow significantly.

## Vectorize & Memoize
Because looping in Python is slow, we rely on boolean matrix operations to calculate the contingency counts. At the core of `Contingent.from_scalar` is a call to `numpy.less_equal.outer()`, which broadcasts the thresholding operation over all possible levels simultaneously.
Because looping in Python is slow, we rely on boolean matrix operations to calculate the contingency counts. At the core of [`Contingent.from_scalar`][contingency.contingent.Contingent.from_scalar] is a call to [`numpy.less_equal.outer`](https://numpy.org/doc/stable/reference/generated/numpy.ufunc.outer.html), which broadcasts the thresholding operation over all possible levels simultaneously.

This is reasonably fast, able to calculate e.g. APS only marginally slower than the scikit-learn implementation.
In addition, the one-time calculation of the "full" contingency set has the added benefit of significantly amortizing the cost of subsequent metric calculations.
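As an illustrative sketch of this broadcasting trick (using made-up sample data, and not necessarily the library's exact implementation), the full set of contingency counts can be computed for every threshold at once:

```python
import numpy as np

# Hypothetical tiny dataset for illustration.
y_true = np.array([0, 1, 1, 0, 1], dtype=bool)
y_pred = np.array([0.1, 0.8, 0.6, 0.4, 0.9])

# Each unique prediction value is a candidate threshold.
thresholds = np.unique(y_pred)

# Boolean matrix: entry (i, j) is True when thresholds[i] <= y_pred[j],
# i.e. sample j is predicted positive at threshold i.
predicted_pos = np.less_equal.outer(thresholds, y_pred)

# Contingency counts for all thresholds simultaneously, with no Python loop.
tp = (predicted_pos & y_true).sum(axis=1)
fp = (predicted_pos & ~y_true).sum(axis=1)
fn = (~predicted_pos & y_true).sum(axis=1)
tn = (~predicted_pos & ~y_true).sum(axis=1)
```

Any threshold-dependent metric can then be evaluated over all levels at once from these four arrays.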



Let's make a much larger test case than before, by adding white noise to a known ground truth.

```ipython
rng = np.random.default_rng(24) ## mph, the avg cruising airspeed velocity of an unladen (european) swallow
rng = np.random.default_rng(24) # (1)!
y_src = rng.random(1000)
y_true = y_src>0.7

y_pred = y_src + 0.05*rng.normal(size=1000)
```

1. Did you know? 24 mph is the cruising airspeed velocity of an unladen (European) swallow.

```ipython
from sklearn.metrics import average_precision_score, matthews_corrcoef
@@ -51,10 +52,20 @@ Say you wish to find the expected value of the MCC score over all thresholds:
1.36 s ± 576 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
176 μs ± 10.9 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

!!! tip

This is one of the key features of `contingency`!

If you have many individual datasets or runs and want to compare cross-threshold metrics (like APS) across many experiments, needing over a second per run adds up quickly!
This is a common problem in feature engineering and model-selection pipelines.

For an example, see the [MENDR benchmark](https://github.com/usnistgov/mendr), where tens of thousands of individual prediction arrays need to be systematically compared via APS and expected MCC.
Using the mean of many `matthews_corrcoef` calls would take a very long time, if not for the optimizations made by `contingency`!

## Subsampling Approximation

The limit to this amortization comes from your RAM: the outer-product matrix can get huge.
The limit to this amortization comes from your RAM: the outer-product matrix we use to vectorize contingency counting can get _huge_.

To mitigate this, `Contingent.from_scalar` has a `subsamples` option, which allows you to approximate the threshold values with an interpolated subset, distributed according to the originals.

With only a few subsamples, the score curves quickly converge to their "true" values.
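The idea can be sketched in plain NumPy (a hypothetical illustration of the approach, not the library's internals): replace the full threshold set with `k` interpolated quantiles, so the subset follows the original distribution of prediction values:

```python
import numpy as np

rng = np.random.default_rng(0)
y_pred = rng.random(100_000)

# The full set of unique thresholds can be enormous for large datasets.
full_thresholds = np.unique(y_pred)

# Approximate it with k interpolated quantiles: the subset is distributed
# like the original thresholds, so score curves stay close to the exact ones.
k = 64
sub_thresholds = np.quantile(full_thresholds, np.linspace(0, 1, k))

# The boolean outer-product matrix now shrinks from
# (n_thresholds, n_samples) to (k, n_samples), bounding memory use.
```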
23 changes: 10 additions & 13 deletions docs/index.md
@@ -4,23 +4,20 @@ icon: lucide/house

# Contingency Documentation

## Welcome


![Contingency logo](./images/logo.svg){ align=right }

> Fast, vectorized metrology with binary contingency counts.
> _Fast, vectorized metrology with binary contingency counts._

Rapidly calculate binary classifier metrics like MCC, F-Scores, and Average Precision Scores from scalar and binary predictions.

For an overview of features, usage, and performance, see the [tutorial](./getting-started/02-tutorial.md).
For an overview of features and usage, see the [tutorial](getting-started/02-tutorial).
For more details about Contingency's performance and intended use cases, see [Performance](getting-started/03-performance).

!!! example "Contact the PI"

## Contact the PI
[Rachael Sexton](https://www.nist.gov/people/rachael-t-sexton)
Email: [`rachael.sexton@nist.gov`](mailto:rachael.sexton@nist.gov)

[Rachael Sexton](https://www.nist.gov/people/rachael-t-sexton)
> [`rachael.sexton@nist.gov`](mailto:rachael.sexton@nist.gov)
```
NIST Engineering Laboratory
Systems Integration Division
Information Modeling & Testing Group
```
NIST Engineering Laboratory
Systems Integration Division
Information Modeling & Testing Group