Commit 231b5a2

Merge pull request #178 from uriahf/100-create-prepare_performance_data-and-prepare_performance_data_times-functions
100 create prepare performance data and prepare performance data times functions
2 parents bb145a1 + 74c6ec2 commit 231b5a2

3 files changed

+344
-0
lines changed

docs/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 /.quarto/
 
 **/*.quarto_ipynb
+_sidebar.yml

docs/_quarto.yml

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,8 @@ website:
     left:
       - href: reference/
         text: Reference
+      - href: before_we_validate.qmd
+        text: Before We Validate
 
 quartodoc:
   # the name used to import the package you want to create reference docs for

docs/before_we_validate.qmd

Lines changed: 341 additions & 0 deletions
@@ -0,0 +1,341 @@
---
title: "Before we Validate Performance"
author: "Uriah Finkel"
format:
  html:
    echo: false
    mermaid-format: svg
---

Ideally, performance validation would stay as agnostic as possible. However, the structure of the validation set (`probs`, `reals` and `times`) implies the nature of the related assumptions and the required use case.

So before we validate performance, let us consider the underlying process.

✍️ The User Inputs\
🪛 Internal Function

# ✍️ Declare reference groups

The dimensions of the `probs` and the `reals` dictionaries imply the nature of the use case:

TODO: copy from rtichoke r README.

##### One Model, One Population:

- Just one reference group: "model".

##### Several Models, One Population:

Compare different candidate models.

- Each model stands as a reference group, such as a "thin" model or a "full" model.

##### Several Models, Several Populations

Compare performance over different sub-populations.

- Internal Validation: "test", "val" and "train".
- External Validation: "Framingham", "Australia".
- Fairness: "Male", "Female".
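
The three use cases above can be sketched as inputs, assuming `probs` and `reals` are plain dictionaries of NumPy arrays keyed by reference group. The helper `infer_use_case`, the key names and the values are hypothetical and only illustrate how the dictionary sizes could hint at the use case; they are not taken from the package.

```python
import numpy as np

def infer_use_case(probs, reals):
    """Guess the use case from the number of entries in each dictionary (hypothetical helper)."""
    if len(reals) > 1:
        return "several populations"
    if len(probs) > 1:
        return "several models, one population"
    return "one model, one population"

# One model, one population: a single reference group called "model".
probs_one = {"model": np.array([0.10, 0.80, 0.35, 0.60])}
reals_one = {"model": np.array([0, 1, 0, 1])}

# Several models, one population: one set of predictions per candidate model (assumed layout).
probs_models = {
    "thin": np.array([0.20, 0.70, 0.40, 0.55]),
    "full": np.array([0.10, 0.80, 0.35, 0.60]),
}
reals_models = {"outcome": np.array([0, 1, 0, 1])}

# Several populations: predictions and outcomes keyed by sub-population.
probs_pops = {"train": np.array([0.10, 0.80]), "test": np.array([0.35, 0.60])}
reals_pops = {"train": np.array([0, 1]), "test": np.array([0, 1])}

for probs, reals in [
    (probs_one, reals_one),
    (probs_models, reals_models),
    (probs_pops, reals_pops),
]:
    print(infer_use_case(probs, reals), "->", list(probs))
```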

# ✍️ Declare how to stratify predictions ✂️

The `stratified_by` argument lets the user choose how to stratify predictions for decision-making. Each method implies a different problem:

::: {.panel-tabset}

## Probability Threshold

::: {.panel-tabset}

By choosing Probability Threshold as a cutoff, the implied assumption is that you are concerned with individual harm or benefit.

### Baseline Strategy: Treat None

```{mermaid}

graph LR
    subgraph trt[Treatment Decision]
        linkStyle default stroke:#000
        A("😷") -->|"Treatment 💊"|B("<B>Predicted<br>Positive</B><br>💊<br>😷")
        A -->|"No Treatment"|C("<B>Predicted<br>Negative</B><br>😷")
    end

    subgraph ut[Utility of the Decision]
        subgraph pred[Prediction Model]
            B -->|"Disease 🤢"| D["<B>TP</B><br>💊<br>🤢"]
            B -->|"No Disease 🤨"| E["<B>FP</B><br>💊<br>🤨"]
            C -->|"Disease 🤢"| F["<B>FN</B><br>🤢"]
            C -->|"No Disease 🤨"| G["<B>TN</B><br>🤨"]
        end
        subgraph baselinestrategy[Baseline Strategy: Treat None]
            Dnone["<B>FN</B><br>🤢"]
            Enone["<B>TN</B><br>🤨"]
            Fnone["<B>FN</B><br>🤢"]
            Gnone["<B>TN</B><br>🤨"]

            D---Dnone
            E---Enone
            F---Fnone
            G---Gnone
        end
        subgraph nb[Net Benefit]
            Dnb[1]
            Enb["pt / (1-pt)"]
            Fnb[0]
            Gnb[0]
            Dnone---Dnb
            Enone---Enb
            Fnone---Fnb
            Gnone---Gnb
        end
    end



    style A fill:#E8F4FF,stroke:black,color:black
    style B fill:#E8F4FF,stroke:black,color:black
    style C fill:#E8F4FF,stroke:black,color:black
    style D fill:#C0FFC0,stroke:black,color:black
    style Dnone fill:#FFCCE0,stroke:black,color:black
    style Dnb fill:#C0FFC0,stroke:black,color:black
    style E fill:#FFCCE0,stroke:black,color:black
    style Enone fill:#C0FFC0,stroke:black,color:black
    style Enb fill:#FFCCE0,stroke:black,color:black
    style F fill:#FFCCE0,stroke:black,color:black
    style Fnone fill:#FFCCE0,stroke:black,color:black
    style Fnb fill:#E8F4FF,stroke:black,color:black
    style G fill:#C0FFC0,stroke:black,color:black
    style Gnone fill:#C0FFC0,stroke:black,color:black
    style Gnb fill:#E8F4FF,stroke:black,color:black
    style nb fill:#E8F4FF,stroke:black,color:black
    style pred fill:#E8F4FF,stroke:black,color:black
    style baselinestrategy fill:#E8F4FF,stroke:black,color:black

    classDef subgraphStyle fill:#FAF6EC,stroke:#333,stroke-width:1px
    class trt,ut subgraphStyle

```
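
The Net Benefit weights in the diagram above (1 for a true positive, `pt / (1-pt)` as the cost of a false positive) can be combined into a single number. The sketch below assumes the usual decision-curve definition of net benefit relative to treating no one; the function name and the counts are illustrative, not part of the package.

```python
def net_benefit(tp, fp, n, pt):
    """Net benefit relative to Treat None at probability threshold pt.

    Each true positive is worth 1; each false positive costs pt / (1 - pt).
    """
    if not 0 < pt < 1:
        raise ValueError("pt must be strictly between 0 and 1")
    return tp / n - (fp / n) * pt / (1 - pt)

# Illustrative counts: 100 patients, 30 TP and 20 FP at a threshold of 0.2.
nb = net_benefit(tp=30, fp=20, n=100, pt=0.2)
print(round(nb, 3))  # about 0.25
```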

### Baseline Strategy: Treat All

```{mermaid}

graph LR
    subgraph trt[Treatment Decision]
        linkStyle default stroke:#000
        A("😷") -->|"Treatment 💊"|B("<B>Predicted<br>Positive</B><br>💊<br>😷")
        A -->|"No Treatment"|C("<B>Predicted<br>Negative</B><br>😷")
    end

    subgraph ut[Utility of the Decision]
        subgraph pred[Prediction Model]
            B -->|"Disease 🤢"| D["<B>TP</B><br>💊<br>🤢"]
            B -->|"No Disease 🤨"| E["<B>FP</B><br>💊<br>🤨"]
            C -->|"Disease 🤢"| F["<B>FN</B><br>🤢"]
            C -->|"No Disease 🤨"| G["<B>TN</B><br>🤨"]
        end
        subgraph baselinestrategy[Baseline Strategy: Treat All]
            Dall["<B>TP</B><br>💊<br>🤢"]
            Eall["<B>FP</B><br>💊<br>🤨"]
            Fall["<B>TP</B><br>💊<br>🤢"]
            Gall["<B>FP</B><br>💊<br>🤨"]

            D---Dall
            E---Eall
            F---Fall
            G---Gall
        end
        subgraph nb[Net Benefit]
            Dnb[0]
            Enb[0]
            Fnb["(1-pt) / pt"]
            Gnb["1"]
            Dall---Dnb
            Eall---Enb
            Fall---Fnb
            Gall---Gnb
        end
    end



    style A fill:#E8F4FF,stroke:black,color:black
    style B fill:#E8F4FF,stroke:black,color:black
    style C fill:#E8F4FF,stroke:black,color:black
    style D fill:#C0FFC0,stroke:black,color:black
    style Dall fill:#C0FFC0,stroke:black,color:black
    style Dnb fill:#E8F4FF,stroke:black,color:black
    style E fill:#FFCCE0,stroke:black,color:black
    style Eall fill:#FFCCE0,stroke:black,color:black
    style Enb fill:#E8F4FF,stroke:black,color:black
    style F fill:#FFCCE0,stroke:black,color:black
    style Fall fill:#C0FFC0,stroke:black,color:black
    style Fnb fill:#FFCCE0,stroke:black,color:black
    style G fill:#C0FFC0,stroke:black,color:black
    style Gall fill:#FFCCE0,stroke:black,color:black
    style Gnb fill:#C0FFC0,stroke:black,color:black
    style nb fill:#E8F4FF,stroke:black,color:black
    style pred fill:#E8F4FF,stroke:black,color:black
    style baselinestrategy fill:#E8F4FF,stroke:black,color:black

    classDef subgraphStyle fill:#FAF6EC,stroke:#333,stroke-width:1px
    class trt,ut subgraphStyle

```
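
With Treat All as the baseline, the diagram above puts the gain on true negatives (weight 1) and the cost on false negatives (weight `(1-pt) / pt`). A minimal sketch of that weighting follows, assuming the "net benefit for the untreated" form used in decision-curve analysis; the function name and counts are illustrative only.

```python
def net_benefit_vs_treat_all(tn, fn, n, pt):
    """Net benefit relative to Treat All at probability threshold pt.

    Each true negative (treatment avoided) is worth 1;
    each false negative (missed treatment) costs (1 - pt) / pt.
    """
    if not 0 < pt < 1:
        raise ValueError("pt must be strictly between 0 and 1")
    return tn / n - (fn / n) * (1 - pt) / pt

# Illustrative counts: 100 patients, 40 TN and 10 FN at a threshold of 0.25.
nb_all = net_benefit_vs_treat_all(tn=40, fn=10, n=100, pt=0.25)
print(round(nb_all, 3))  # about 0.1
```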

*Regardless* of ranking, each prediction is categorised to a bin: 0.32 -\> `[0.3, 0.4)`.

1. Categorise Absolute Risk: 0.32 -\> `[0.3, 0.4)`
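
A minimal sketch of this categorisation step, assuming ten equal-width half-open bins of width 0.1. The function name is hypothetical and the package may bin differently.

```python
import numpy as np

def bin_by_threshold(probs):
    """Label each probability with its half-open bin [lower, upper) of width 0.1."""
    probs = np.asarray(probs, dtype=float)
    # The small epsilon guards against floating-point values landing just below a boundary.
    idx = np.floor(probs * 10 + 1e-9).astype(int)
    # Assumption: a probability of exactly 1.0 is placed in the last bin.
    idx = np.clip(idx, 0, 9)
    return [f"[{i / 10:.1f}, {(i + 1) / 10:.1f})" for i in idx]

print(bin_by_threshold([0.32, 0.05, 0.3]))
```

The bin depends only on the absolute risk, so adding or removing other predictions does not change it.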

References: Pauker SG, Kassirer JP. Therapeutic decision making: a cost-benefit analysis. N Engl J Med. 1975;293(5):229-234. doi:10.1056/NEJM197507312930505

:::

## PPCR

![](line_ppcr_04.svg)

```{mermaid}

graph LR
    subgraph trt[Treatment Allocation Decision]
        linkStyle default stroke:#000
        A("😷<br>😷<br>😷<br>😷<br>😷<br>😷<br>😷<br>😷<br>😷<br>😷") -->|"Treatment 💊💊💊💊"|B("<B>Σ Predicted<br>Positives</B><br>💊💊💊💊<br>😷😷😷😷")
        A -->|"No Treatment"|C("<B>Σ Predicted<br>Negatives</B><br>😷😷😷😷😷😷")
    end

    subgraph ut[Utility of the Decision]
        B -->|"Disease 🤢🤢🤢"| D["<B>Σ TP</B><br>💊💊💊<br>🤢🤢🤢"]
        B -->|"No Disease 🤨"| E["<B>Σ FP</B><br>💊<br>🤨"]
        C -->|"Disease 🤢"| F["<B>Σ FN</B><br>🤢"]
        C -->|"No Disease 🤨🤨🤨🤨🤨"| G["<B>Σ TN</B><br>🤨🤨🤨🤨🤨"]
    end



    style A fill:#E8F4FF,stroke:black,color:black
    style B fill:#E8F4FF,stroke:black,color:black
    style C fill:#E8F4FF,stroke:black,color:black
    style D fill:#C0FFC0,stroke:black,color:black
    style E fill:#FFCCE0,stroke:black,color:black
    style F fill:#FFCCE0,stroke:black,color:black
    style G fill:#C0FFC0,stroke:black,color:black

    classDef subgraphStyle fill:#FAF6EC,stroke:#333,stroke-width:1px
    class trt,ut subgraphStyle

```

By choosing PPCR as a cutoff, the implied assumption is that you are concerned with resource constraints and assume no individual treatment harm.

*Regarding* the ranking, each prediction is categorised to a bin: if the absolute probability 0.32 is the 18th highest prediction out of 100, it is categorised to the second decile -\> `0.18`.

1. Calculate Risk-Quantile from Absolute Risk: 0.32 -\> `0.18`
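
The rank-based step can be sketched as follows, assuming the risk quantile is the descending rank divided by the number of predictions, with ties broken by input order. The function name and the synthetic predictions are hypothetical.

```python
import numpy as np

def risk_quantile(probs):
    """Descending rank of each prediction divided by the number of predictions."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-probs, kind="stable")    # indices from highest to lowest risk
    ranks = np.empty(len(probs), dtype=int)
    ranks[order] = np.arange(1, len(probs) + 1)  # rank 1 = highest predicted risk
    return ranks / len(probs)

# Synthetic example: 17 predictions above 0.32, then 0.32 itself, then 82 below it.
probs = np.concatenate([
    np.linspace(0.5, 0.9, 17),
    [0.32],
    np.linspace(0.01, 0.30, 82),
])
quantiles = risk_quantile(probs)
print(quantiles[17])  # 0.32 is the 18th highest of 100 predictions
```

Unlike the probability threshold, this value changes whenever the other predictions in the population change.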

References: https://en.wikipedia.org/wiki/Precision_and_recall

:::

# ✍️ Declare Fixed Time Horizons 🌅 (📅🤬)

The `fixed_time_horizons` argument lets the user choose the set of time horizons to follow.

Different follow-ups contain different distributions of observed outcomes: declare fixed time horizons for the prediction model, such as \[5, 10\] years of prediction for a CVD event.

## 🪛 Update Administrative Censoring

When the observed time-to-event does not match the prediction time horizon, the outcomes might change:

- `Real Positives` 🤢 whose event occurs after the time horizon should be considered `Real Negatives` 🤨: the outcome of interest has not happened yet.

  - Always included and encoded as 0.

- `Real Negatives` 🤨 whose observed time is shorter than the time horizon should be considered `Real Censored` 🤬: the event of interest could have happened in the gap between the observed time and the fixed time horizon.

  - If adjusted: encoded as 0.

  - If excluded: counted with crude estimate.

```{python}

import numpy as np

times = np.array([24.1, 9.7, 49.9, 18.6, 34.8, 14.2, 39.2, 46.0, 31.5, 4.3])
reals = np.array([1, 1, 1, 1, 0, 2, 1, 2, 0, 1])
time_horizons = [10, 20, 30, 40, 50]

# Icon for the outcome as seen at horizon h
def get_icon(outcome, t, h):
    if outcome == 0:
        # Censored before the horizon (🤬) or still event-free at the horizon (🤨)
        return "🤬" if t < h else "🤨"
    elif outcome == 1:
        return "🤢"
    elif outcome == 2:
        return "💀"

# Displayed time: event-free follow-up is truncated at the horizon
def get_time(outcome, t, h):
    if outcome == 0:
        return t if t < h else h
    else:
        return t

# Final output: one record per (observation, horizon) pair
final_data = []

for i in range(len(times)):
    id_ = i + 1
    t = float(times[i])  # plain Python types serialise cleanly
    r = int(reals[i])

    for h in time_horizons:
        outcome = r if t <= h else 0  # override outcome after horizon
        final_data.append({
            "id": id_,
            "time_horizon": h,
            "time": get_time(outcome, t, h),
            "real": get_icon(outcome, t, h)
        })

ojs_define(data = final_data)

```

```{ojs}

filteredData = data.filter((d) => d.time_horizon == timeHorizon)

viewof timeHorizon = Inputs.range([10, 50], {
  step: 10,
  value: 50,
  label: "Time Horizon"
})

Plot.plot({
  x: {
    domain: [0, 50]
  },
  y: {
    domain: [0, 11],
    axis: false
  },
  marks: [
    Plot.ruleX([timeHorizon], {
      stroke: "#D9E8A3",
      strokeWidth: 6,
      strokeDasharray: "5,5",
      y1: 0,
      y2: 11 // Should match the y-domain max
    }),
    Plot.ruleY(filteredData, {
      x: "time",
      y: "id",
      strokeWidth: 1.5
    }),
    Plot.text(filteredData, {
      x: "time",
      y: "id",
      text: "real",
      tip: true,
      fontSize: 30
    })
  ]
})

```
