-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
Description
This relates to the unit test example where you create a df and then save as CSV only to reload and check the columns.
I think what is included is a good introduction. But it means the unit test is linked to both the user's file system and the pandas.read_csv function. There's a way to run this test without either, but it is a bit more advanced.
Have you used monkey patching before?
There's a nice way to do this in pytest docs: https://docs.pytest.org/en/stable/how-to/monkeypatch.html
An example:
import pandas as pd
import pytest
from waitingtimes.patient_analysis import import_patient_data
def test_import_success(monkeypatch):
# Create dataframe to use in testing
expected_cols = [
"PATIENT_ID",
"ARRIVAL_DATE", "ARRIVAL_TIME",
"SERVICE_DATE", "SERVICE_TIME",
]
df_in = pd.DataFrame(
[["p1", "2024-01-01", "08:00", "2024-01-01", "09:00"]],
columns=expected_cols,
)
# as don't really want to check pandas ability to read a CSV we just return the
# dataframe we created in memory. this also removes the need to write out to hard disk
def mock_read_csv(path):
return df_in
# temp replace the read_csv function of pd
monkeypatch.setattr(pd, "read_csv", mock_read_csv)
# the path is now irrelevant
result = import_patient_data("does_not_matter.csv")
pd.testing.assert_frame_equal(result, df_in)Another thing you could do with mock_read_csv is check that it has been correctly passed a Path. thinking something like:
def mock_read_csv(path):
assert isinstance(path, Path)
return df_inThis is more complicated, but I think best practice to avoid IO. Maybe a drop down with extra info or an advanced section?