Commit 725e7d7

Merge pull request #272 from uriahf/docs/update-readme-and-tutorial-17743907015561309458
docs: Update README and "Getting Started" Tutorial
2 parents 11edff9 + 476b565 commit 725e7d7

File tree

2 files changed: +107 additions, -34 deletions


README.md

Lines changed: 38 additions & 4 deletions
@@ -7,7 +7,43 @@
 * **Gains and Lift Charts**
 * **Decision Curves**
 
-The library is designed to be easy to use, while still offering a high degree of control over the final plots.
+The library is designed to be easy to use, while still offering a high degree of control over the final plots. For some reproducible examples please visit the [rtichoke blog](https://uriahf.github.io/rtichoke-py/blog.html)!
+
+## Installation
+
+You can install `rtichoke` from PyPI:
+
+```bash
+pip install rtichoke
+```
+
+## Getting Started
+
+To use `rtichoke`, you'll need two main inputs:
+
+* `probs`: A dictionary containing your model's predicted probabilities.
+* `reals`: A dictionary of the true binary outcomes.
+
+Here's a quick example of how to create a ROC curve for a single model:
+
+```python
+import numpy as np
+import rtichoke as rk
+
+# Sample data for a model. Note that the probabilities for the
+# positive class (1) are generally higher than for the negative class (0).
+probs = {'Model A': np.array([0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6])}
+reals = {'Population': np.array([0, 1, 0, 1, 0, 1, 0, 1])}
+
+
+# Create the ROC curve
+fig = rk.create_roc_curve(
+    probs=probs,
+    reals=reals
+)
+
+fig.show()
+```
 
 ## Key Features
 
@@ -18,6 +54,4 @@ The library is designed to be easy to use, while still offering a high degree of
 
 ## Documentation
 
-For a complete guide to the library, including a "Getting Started" tutorial and a full API reference, please see the **[official documentation](https://your-documentation-url.com)**.
-
-*(Note: The documentation URL will need to be updated once the website is deployed.)*
+For a complete guide to the library, including a "Getting Started" tutorial and a full API reference, please see the **[official documentation](https://uriahf.github.io/rtichoke-py/)**.
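As an aside (not part of this commit, and not `rtichoke` API): each point on the ROC curve that the new README example draws is a (false positive rate, true positive rate) pair obtained by thresholding the predicted probabilities. A minimal NumPy sketch computing one such point at a 0.5 threshold, using the same sample data as the example above:

```python
import numpy as np

# Same sample data as the README example
probs = np.array([0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6])
reals = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# Classify as positive when the predicted probability reaches the threshold
threshold = 0.5
preds = probs >= threshold

tp = np.sum(preds & (reals == 1))   # true positives
fp = np.sum(preds & (reals == 0))   # false positives
fn = np.sum(~preds & (reals == 1))  # false negatives
tn = np.sum(~preds & (reals == 0))  # true negatives

tpr = tp / (tp + fn)  # sensitivity / recall
fpr = fp / (fp + tn)  # 1 - specificity

print(tpr, fpr)  # prints: 1.0 0.0
```

This toy model separates the classes perfectly at 0.5, which is why its ROC curve would pass through the top-left corner.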

docs/tutorials/getting_started.qmd

Lines changed: 69 additions & 30 deletions
@@ -1,8 +1,8 @@
 ---
-title: "Getting Started with Rtichoke"
+title: "Getting Started with rtichoke"
 ---
 
-This tutorial provides a basic introduction to the `rtichoke` library. We'll walk through the process of preparing data, creating a decision curve, and visualizing the results.
+This tutorial provides an introduction to the `rtichoke` library, showing how to visualize model performance for different scenarios.
 
 ## 1. Import Libraries
 
@@ -13,50 +13,89 @@ import numpy as np
 import rtichoke as rk
 ```
 
-## 2. Prepare Your Data
+## 2. Understanding the Inputs
 
-`rtichoke` expects data in a specific format. You'll need two main components:
+`rtichoke` expects two main inputs for creating performance curves:
 
-* **Probabilities (`probs`)**: A dictionary where keys are model names and values are NumPy arrays of predicted probabilities.
-* **Real Outcomes (`reals`)**: A NumPy array containing the true binary outcomes (0 or 1).
+* **`probs` (Probabilities)**: A dictionary where keys are model or population names and values are lists or NumPy arrays of predicted probabilities.
+* **`reals` (Outcomes)**: A dictionary where keys are population names and values are lists or NumPy arrays of the true binary outcomes (0 or 1).
 
-Let's create some sample data for two different models:
+Let's look at the three main use cases.
+
+### Use Case 1: Single Model
+
+This is the simplest case, where you want to evaluate the performance of a single predictive model.
+
+For this, you provide `probs` with a single entry for your model and `reals` with a single entry for the corresponding outcomes.
 
 ```python
-# Sample data from the dcurves_example.py script
-probs_dict = {
-    "Marker": np.array([
-        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5,
-        0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
-    ]),
-    "Marker2": np.array([
-        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5,
-        0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
-    ])
-}
-reals = np.array([
-    1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1
-])
+# Sample data for a model. Note that the probabilities for the
+# positive class (1) are generally higher than for the negative class (0).
+probs_single = {"Model A": np.array([0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6])}
+reals_single = {"Population": np.array([0, 1, 0, 1, 0, 1, 0, 1])}
+
+# Create a ROC curve
+fig = rk.create_roc_curve(
+    probs=probs_single,
+    reals=reals_single,
+)
+
+# In an interactive environment (like a Jupyter notebook),
+# this will display the plot.
+fig.show()
 ```
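A note for intuition (not part of the diff): the area under the ROC curve plotted in Use Case 1 equals the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case. A plain NumPy sketch with the same sample data:

```python
import numpy as np

probs = np.array([0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6])
reals = np.array([0, 1, 0, 1, 0, 1, 0, 1])

pos = probs[reals == 1]  # scores of the positive cases
neg = probs[reals == 0]  # scores of the negative cases

# Fraction of (positive, negative) pairs ranked correctly, ties counting half
auc = ((pos[:, None] > neg[None, :]).mean()
       + 0.5 * (pos[:, None] == neg[None, :]).mean())

print(auc)  # prints: 1.0 (every positive outscores every negative)
```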

-## 3. Create a Decision Curve
+### Use Case 2: Models Comparison
+
+Often, you want to compare the performance of several different models on the *same* population.
 
-Now that we have our data, we can create a decision curve. This is a simple one-liner with `rtichoke`:
+For this, you provide `probs` with an entry for each model you want to compare. `reals` will still have a single entry, since the outcome data is the same for all models.
 
 ```python
-fig = rk.create_decision_curve(
-    probs=probs_dict,
-    reals=reals,
+# Sample data for two models. Model A is better at separating the classes.
+probs_comparison = {
+    "Model A": np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7]),
+    "Model B": np.array([0.2, 0.8, 0.3, 0.7, 0.4, 0.6]),
+    "Random Guess": np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5])
+}
+reals_comparison = {"Population": np.array([0, 1, 0, 1, 0, 1])}
+
+
+# Create a precision-recall curve to compare the models
+fig = rk.create_precision_recall_curve(
+    probs=probs_comparison,
+    reals=reals_comparison,
 )
+
+fig.show()
 ```
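To connect this to the numbers behind a precision-recall curve (an illustrative sketch, not from the commit): at any single threshold, precision is the fraction of predicted positives that are truly positive, and recall is the fraction of true positives that are caught. With the comparison data above at a 0.5 threshold:

```python
import numpy as np

probs_comparison = {
    "Model A": np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7]),
    "Model B": np.array([0.2, 0.8, 0.3, 0.7, 0.4, 0.6]),
    "Random Guess": np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5]),
}
reals = np.array([0, 1, 0, 1, 0, 1])  # the shared "Population" outcomes

metrics = {}
for name, probs in probs_comparison.items():
    preds = probs >= 0.5
    tp = np.sum(preds & (reals == 1))
    precision = tp / np.sum(preds)    # predicted positives that are real
    recall = tp / np.sum(reals == 1)  # real positives that are caught
    metrics[name] = (float(precision), float(recall))

print(metrics)
```

Both models are perfect at this threshold, while the random guesser keeps full recall but only 50% precision, which is why its curve sits lower.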

-## 4. Show the Plot
+### Use Case 3: Several Populations
 
-Finally, let's display the plot. Since `rtichoke` uses Plotly under the hood, you can show the figure just like any other Plotly object.
+This is useful when you want to evaluate a single model's performance across different populations. A common example is comparing performance on a training set versus a testing set to check for overfitting.
+
+For this, you provide `probs` with an entry for each population and `reals` with a corresponding entry for each population's outcomes.
 
 ```python
-# To display the plot in an interactive environment (like a Jupyter notebook)
+# Sample data for a train and test set.
+# The model performs slightly better on the train set.
+probs_populations = {
+    "Train": np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7]),
+    "Test": np.array([0.2, 0.8, 0.3, 0.7, 0.4, 0.6])
+}
+reals_populations = {
+    "Train": np.array([0, 1, 0, 1, 0, 1]),
+    "Test": np.array([0, 1, 0, 1, 0, 0])  # Note one outcome is different
+}
+
+# Create a calibration curve to compare the model's performance
+# on the two populations.
+fig = rk.create_calibration_curve(
+    probs=probs_populations,
+    reals=reals_populations,
+)
+
 fig.show()
 ```
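A quick way to see why the test set looks miscalibrated in this toy data (again a NumPy sketch, not part of the commit): compare each population's mean predicted probability with its observed event rate. A calibration curve generalizes exactly this comparison across probability bins.

```python
import numpy as np

probs_populations = {
    "Train": np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7]),
    "Test": np.array([0.2, 0.8, 0.3, 0.7, 0.4, 0.6]),
}
reals_populations = {
    "Train": np.array([0, 1, 0, 1, 0, 1]),
    "Test": np.array([0, 1, 0, 1, 0, 0]),
}

summary = {}
for pop in probs_populations:
    mean_pred = probs_populations[pop].mean()   # average predicted risk
    event_rate = reals_populations[pop].mean()  # observed fraction of 1s
    summary[pop] = (round(float(mean_pred), 3), round(float(event_rate), 3))

print(summary)
# On "Train" the two averages agree; on "Test" the model predicts
# more events (0.5) than actually occur (~0.333), i.e. it over-predicts.
```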

-And that's it! You've created your first decision curve with `rtichoke`. From here, you can explore the other curve types and options that the library has to offer.
+And that's it! You've now seen how to create three of the most common evaluation plots with `rtichoke`. From here, you can explore the other curve types and options that the library has to offer in the [API Reference](../reference/index.qmd).
