Should we include a "smoke test" example to complement the functional tests? #17

Did you come across smoke tests in The Turing Way? https://book.the-turing-way.org/reproducible-research/testing/testing-smoketest/

They are typically the first tests you run and are usually very fast. They check that your code can "run end to end" and not much more, i.e. they don't care whether the results are correct, just that it runs. If they fail, all other tests are cancelled.

For analysis code I think this is **very similar** to your functional tests, but it would purposely use a very small dummy dataset so that it is fast, and perform only the most basic asserts at the end to check that it ran. If that fails, the remaining tests would be abandoned.

I asked GPT5.2 to convert your functional example to a smoke test where all it does is check that it has produced the final `dict`. Note I think you could also consider monkey patching (#16) this so that there is no real write to and read from the user's directory. See below.

We need a way to run this so that if it fails the remaining tests are skipped. If we assume your smoke test is in `test_smoke.py`, I think this approach would work:

```bash
pytest test_smoke.py && pytest --ignore=test_smoke.py
```
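
If a shell-independent version of that two-step run is ever wanted, it could be sketched in Python with `pytest.main`. This is only a rough sketch: `run_smoke_then_rest` and its `runner` parameter are made-up names for illustration, not something in the repo.

```python
import pytest


def run_smoke_then_rest(smoke_file="test_smoke.py", runner=pytest.main):
    """Run the smoke file first; run everything else only if it passes.

    `runner` defaults to pytest.main and is injectable so the control
    flow can be checked without starting a real pytest session.
    """
    # Step 1: smoke test only. A non-zero exit code means it failed.
    smoke_exit = int(runner([smoke_file]))
    if smoke_exit != 0:
        return smoke_exit

    # Step 2: all remaining tests, skipping the smoke file already run.
    return int(runner([f"--ignore={smoke_file}"]))
```

Calling `sys.exit(run_smoke_then_rest())` from a small script would give the same pass/fail behaviour as the `&&` one-liner.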

The smoke test:

```python
import pandas as pd

from waitingtimes.patient_analysis import (
    import_patient_data,
    calculate_wait_times,
    summary_stats,
)


def test_workflow_smoke_end_product(tmp_path):
    """Smoke: end-to-end workflow produces the expected final output shape."""
    # Create test data
    test_data = pd.DataFrame(
        {
            "PATIENT_ID": ["p1", "p2", "p3"],
            "ARRIVAL_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
            "ARRIVAL_TIME": ["0800", "0930", "1015"],
            "SERVICE_DATE": ["2024-01-01", "2024-01-01", "2024-01-02"],
            "SERVICE_TIME": ["0830", "1000", "1045"],
        }
    )

    # Write test CSV
    csv_path = tmp_path / "patients.csv"
    test_data.to_csv(csv_path, index=False)

    # Run complete workflow
    df = import_patient_data(csv_path)
    df = calculate_wait_times(df)
    stats = summary_stats(df["waittime"])

    # Final "end product" check only
    # TM note: I think you could also just use `assert stats is not None` for a smoke test.
    assert isinstance(stats, dict)
    assert {"mean", "std_dev", "ci_lower", "ci_upper"}.issubset(stats)

```
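
A rough sketch of the monkey patching idea mentioned above, so that nothing is written to or read from disk. It assumes `import_patient_data` calls `pd.read_csv` internally, which I have not checked; the function defined here is a hypothetical stand-in so that the example runs on its own.

```python
import pandas as pd
import pytest


# Hypothetical stand-in for waitingtimes.patient_analysis.import_patient_data.
# Assumption: the real function reads the file via pd.read_csv.
def import_patient_data(path):
    return pd.read_csv(path, dtype=str)


def test_import_smoke_no_disk(monkeypatch):
    """Smoke: the import step runs without touching the file system."""
    fake = pd.DataFrame(
        {
            "PATIENT_ID": ["p1"],
            "ARRIVAL_DATE": ["2024-01-01"],
            "ARRIVAL_TIME": ["0800"],
            "SERVICE_DATE": ["2024-01-01"],
            "SERVICE_TIME": ["0830"],
        }
    )

    # Swap pd.read_csv for a function returning the in-memory frame.
    # pytest undoes the patch automatically when the test finishes.
    monkeypatch.setattr(pd, "read_csv", lambda *args, **kwargs: fake.copy())

    # The path is never opened, so it does not need to exist.
    df = import_patient_data("not_a_real_file.csv")

    assert df is not None
    assert len(df) == 1
```

With the real package, the same patch would sit before the `import_patient_data(...)` call in the smoke test, and `tmp_path` plus the `to_csv` step could then be dropped.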

Again, this could just be a caveat or note? It is useful for simulation testing imo.




