From e02942e42e0ddfc57f0347f418191ee7805ed98f Mon Sep 17 00:00:00 2001
From: Michael Clerx
Date: Wed, 19 Nov 2025 15:19:57 +0000
Subject: [PATCH 1/9] Working through run-herg-qc

---
 README.md                                 |  83 +-
 pcpostprocess/leak_correct.py             |   8 +-
 pcpostprocess/scripts/run_herg_qc.py      | 757 +++++++++---------
 .../scripts/summarise_herg_export.py      |   3 -
 4 files changed, 428 insertions(+), 423 deletions(-)

diff --git a/README.md b/README.md
index 1106afd..62f4d7d 100644
--- a/README.md
+++ b/README.md
@@ -46,42 +46,71 @@ python3 -m unittest
 
 ## Usage
 
-### Running QC and post-processing
+### Classic case
 
-Quality control (QC) may be run using the criteria outlined in [Rapid Characterization of hERG Channel Kinetics I](https://doi.org/10.1016/j.bpj.2019.07.029) and [Evaluating the predictive accuracy of ion channel models using data from multiple experimental designs](https://doi.org/10.1101/2024.08.16.608289). These criteria assume the use of the `staircase` protocol for quality control, which should be the first and last protocol performed. We also assume the presence of repeats after the addition of an IKr blocker (such as dofetilide).
+- Syncropatch 384 ("DataControl 384" output)
+- A full repeat of all protocols with an IKr blocker, e.g. E-4031
+- A staircase protocol, run 4 times:
+  - Before and after all other protocols, before drug block
+  - Before and after all other protocols, after drug block
+- Additional protocols run in between the staircases
+- Leak subtraction by...
 
-Prior to performing QC and exporting, an `export_config.py` file should be added to the root of the data directory. This file (see `example_config.py`) contains a Python `dict` (`Q2S_DC`) specifying the filenames of the protocols used for QC, and names they should be outputted with, as well as a Python `dict` (`D2S`) listing the other protocols and names to be used for their output. Additionally, the `saveID` field specifies the name of the expeirment which appears in the output file names.
+Quality control (QC) may be run using the criteria outlined in [Rapid Characterization of hERG Channel Kinetics I](https://doi.org/10.1016/j.bpj.2019.07.029) and [Evaluating the predictive accuracy of ion channel models using data from multiple experimental designs](https://doi.org/10.1101/2024.08.16.608289).
+These criteria assume the use of the `staircase` protocol for quality control, which should be the first and last protocol performed.
+We also assume the presence of repeats after the addition of an IKr blocker (such as dofetilide).
+
+Prior to performing QC and exporting, an `export_config.py` file should be added to the root of the data directory.
+This file (see `example_config.py`) contains a Python `dict` (`D2S_QC`) specifying the filenames of the protocols used for QC and the names they should be output with, as well as a Python `dict` (`D2S`) listing the other protocols and the names to be used for their output.
+Additionally, the `saveID` field specifies the name of the experiment, which appears in the output file names.
+ +An example input directory might contain the following subdirectories +- `staircaseramp (2)_2kHz_15.01.07`, staircase 1 before E-4031 +- `StaircaseInStaircaseramp (2)_2kHz_15.01.51`, non-QC protocol +- `staircaseramp (2)_2kHz_15.06.53`, staircase 2 before E-4031 +- `staircaseramp (2)_2kHz_15.11.33`, staircase 1 after E-4031 +- `StaircaseInStaircaseramp (2)_2kHz_15.12.17`, non-QC protocol +- `staircaseramp (2)_2kHz_15.17.19`, staircase 2 after E-4031 + +`run_herg_qc` will use the time parts of the staircase names to work out which is which. +This assumes there are either 2 staircase runs (once before drug, once after), or 4 (as above) + +Example: +``` +# Configuration for run_herg_qc + +# Save name for this set of data +saveID = 'EXPERIMENT_NAME' + +# Name of protocols to use in QC, mapped onto names used in output +D2S_QC = { + 'staircaseramp (2)_2kHz': 'staircaseramp' +} + +# Name of additional protocols to export, but not use in QC (again as a dict) +D2S = { + 'StaircaseInStaircaseramp (2)_2kHz': 'sis', +} +``` + +Next, we call `run_herg_qc`: ```sh -$ pcpostprocess run_herg_qc --help +pcpostprocess run_herg_qc test_data/13112023_MW2_FF -w A01 A02 A03 -o output --output_traces -usage: pcpostprocess run_herg_qc [-h] [-c NO_CPUS] - [--output_dir OUTPUT_DIR] [-w WELLS [WELLS ...]] - [--protocols PROTOCOLS [PROTOCOLS ...]] - [--reversal_spread_threshold REVERSAL_SPREAD_THRESHOLD] [--export_failed] - [--selection_file SELECTION_FILE] [--subtracted_only] - [--figsize FIGSIZE FIGSIZE] - [--debug] [--log_level LOG_LEVEL] [--Erev EREV] - data_directory +``` -positional arguments: - data_directory -options: - -h, --help show this help message and exit - -c NO_CPUS, --no_cpus NO_CPUS Number of workers to spawn in the multiprocessing pool (default: 1) - --output_dir OUTPUT_DIR path where the output will be saved - -w WELLS [WELLS ...], --wells WELLS [WELLS ...] wells to include (default: all) - --protocols PROTOCOLS [PROTOCOLS ...] protocols to include (default: all) - --reversal_spread_threshold REVERSAL_SPREAD_THRESHOLD The maximum spread in reversal potential (across sweeps) allowed for QC - --export_failed Flag specifying whether to produce full output for those wells failing QC (default: false) - --selection_file SELECTION_FILE File listing wells to be included - --figsize FIGSIZE FIGSIZE - --debug - --log_level LOG_LEVEL - --Erev EREV The reversal potential during the experiment + +```sh +$ pcpostprocess run_herg_qc --help ``` + + + + + ### Exporting Summary The `summarise_herg_export` command produces additionally output after `run_herg_qc` has been run. 
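The README section above explains that `run_herg_qc` splits each directory name into a protocol name and a `HH.MM.SS` time part, and then pairs the sorted staircase times into before/after-drug repeats. The sketch below (not from the patch itself) illustrates that pairing using the example directory names above; the simplified regular expression and the directory list are illustrative assumptions, and the authoritative pattern lives in `run_herg_qc.py`.

```python
import re

# Simplified version of the pattern used in run_herg_qc.py:
# a protocol name, an underscore, and an HH.MM.SS time part.
name_pattern = re.compile(
    r'^([a-zA-Z_0-9 \-()]+)_([0-9]{2}\.[0-9]{2}\.[0-9]{2})$')

# Hypothetical directory names, taken from the example above
dirnames = [
    'staircaseramp (2)_2kHz_15.01.07',
    'StaircaseInStaircaseramp (2)_2kHz_15.01.51',
    'staircaseramp (2)_2kHz_15.06.53',
    'staircaseramp (2)_2kHz_15.11.33',
    'StaircaseInStaircaseramp (2)_2kHz_15.12.17',
    'staircaseramp (2)_2kHz_15.17.19',
]

# Collect the recording times found for each protocol
times_per_protocol = {}
for dirname in dirnames:
    match = name_pattern.match(dirname)
    if match is not None:
        protocol, time = match.group(1), match.group(2)
        times_per_protocol.setdefault(protocol, []).append(time)

# For the QC protocol, the sorted times are interpreted as either
# [before, after] (2 runs) or [before 1, before 2, after 1, after 2] (4 runs)
times = sorted(times_per_protocol['staircaseramp (2)_2kHz'])
if len(times) == 2:
    repeats = [times]
elif len(times) == 4:
    repeats = [[times[0], times[2]], [times[1], times[3]]]
else:
    raise ValueError('Expecting 2 or 4 repeats of the QC protocol')

print(repeats)
# [['15.01.07', '15.11.33'], ['15.06.53', '15.17.19']]
```

In `run_herg_qc.py` the second repeat additionally gets `_2` appended to its save name (e.g. `staircaseramp_2`), as shown in the script changes below.
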
diff --git a/pcpostprocess/leak_correct.py b/pcpostprocess/leak_correct.py index c423883..7871cce 100644 --- a/pcpostprocess/leak_correct.py +++ b/pcpostprocess/leak_correct.py @@ -104,26 +104,26 @@ def fit_linear_leak(current, voltage, times, ramp_start_index, ramp_end_index, time_range = (0, times.max() / 5) #  Current vs time - ax1.set_title(r'\textbf{a}', loc='left') + ax1.set_title('a', loc='left') ax1.set_xlabel(r'$t$ (ms)') ax1.set_ylabel(r'$I_\mathrm{obs}$ (pA)') ax1.set_xticklabels([]) ax1.set_xlim(*time_range) # Voltage vs time - ax2.set_title(r'\textbf{b}', loc='left') + ax2.set_title('b', loc='left') ax2.set_xlabel(r'$t$ (ms)') ax2.set_ylabel(r'$V_\mathrm{cmd}$ (mV)') ax2.set_xlim(*time_range) # Current vs voltage - ax3.set_title(r'\textbf{c}', loc='left') + ax3.set_title('c', loc='left') ax3.set_xlabel(r'$V_\mathrm{cmd}$ (mV)') ax3.set_ylabel(r'$I_\mathrm{obs}$ (pA)') ax4.set_xlabel(r'$t$ (ms)') ax4.set_ylabel(r'current (pA)') - ax4.set_title(r'\textbf{d}', loc='left') + ax4.set_title('d', loc='left') start_t = times[ramp_start_index] end_t = times[ramp_end_index] diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index 18c87ae..0d4df65 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -33,11 +33,10 @@ def starmap(n, func, iterable): Like ``multiprocessing.Pool.starmap``, but does not use subprocesses when n=1. """ - if n == 1: - return [func(*args) for args in iterable] - else: # pragma: no cover + if n > 1: # pragma: no cover with multiprocessing.Pool(n, maxtasksperchild=1) as pool: return pool.starmap(func, iterable) + return [func(*args) for args in iterable] def run_from_command_line(): # pragma: no cover @@ -113,7 +112,19 @@ def run(data_path, output_path, qc_map, wells=None, reversal_potential=-90, reversal_spread_threshold=10, max_processes=1, figure_size=None, save_id=None): """ - Imports traces and runs QC. + Imports traces and runs QC+. + + Makes the following assumptions: + + This proceeds with the following steps: + + 1. All wells (or those selected with ``wells``) are run through "plain" QC, + for every protocol in ``qc_map``. + + + + + @param data_path The path to read data from @param output_path The path to write output to @@ -134,7 +145,7 @@ def run(data_path, output_path, qc_map, wells=None, @param figure_size An optional tuple specifying the size of figures to create - @param save_id TODO + @param save_id Used in some outputs, e.g. as part of CSV names """ # TODO reversal_spread_threshold should be specified the same way as all @@ -146,9 +157,6 @@ def run(data_path, output_path, qc_map, wells=None, # TODO Remove protocol selection here: this is done via the export file! # Only protocols listed there are accepted - # TODO: Find some way around setting this? - matplotlib.use('Agg') - # Select wells to use all_wells = [row + str(i).zfill(2) for row in string.ascii_uppercase[:16] for i in range(1, 25)] @@ -178,11 +186,14 @@ def run(data_path, output_path, qc_map, wells=None, r'^([a-z|A-Z|_|0-9| |\-|\(|\)]+)_([0-9][0-9]\.[0-9][0-9]\.[0-9][0-9])$' protocols_regex = re.compile(protocols_regex) - # Gather protocol directories to use in a dictionary k:v, where v is a list - # of times so that k_v[i] is a directory name - # TODO: Replace this by looping over qc_map and write_map + # Gather protocol directories to use in a dictionary + # { protocol_name: [time1, time2, time3, ...] 
} + # such that protocol_name_time is a directory + + # TODO: Replace this by looping over qc_map and write_map? res_dict = {} for dirname in os.listdir(data_path): + print(dirname, os.path.basename(dirname)) dirname = os.path.basename(dirname) match = protocols_regex.match(dirname) @@ -192,6 +203,7 @@ def run(data_path, output_path, qc_map, wells=None, protocol_name = match.group(1) if not (protocol_name in qc_map or protocol_name in write_map): + print(f'Skipping {protocol_name}') continue time = match.group(2) @@ -201,16 +213,29 @@ def run(data_path, output_path, qc_map, wells=None, res_dict[protocol_name].append(time) - readnames, savenames, times_list = [], [], [] + def pront(*args): + print('********') + for arg in args: + print(arg) + + pront(res_dict) + # At this point, despite its name, res_dict is not a dictionary of results, + # but a map of QC protocol names onto lists of times (see comment above) + + # + # Prepare arguments to call `run_qc_for_protocol` + # combined_dict = qc_map | write_map # Select QC protocols and times + readnames, savenames, times_list = [], [], [] for protocol in res_dict: if protocol not in qc_map: continue times = sorted(res_dict[protocol]) + pront(times) savename = qc_map[protocol] @@ -231,6 +256,18 @@ def run(data_path, output_path, qc_map, wells=None, times_list.append([times[1], times[3]]) readnames.append(protocol) + else: + raise ValueError('Expecting 2 or 4 repeats of the QC protocol') + + # For two repeats, we now have + # savenames: short user name, one per QC protocol + # times_list: the time part of a dirname, [before, after] + # For four repeats + # savenames: short user name, repeat has _2 added on + # times_list: [before1, after1], [before2, after2] + + pront(readnames, savenames, times_list) + if not readnames: raise ValueError('No compatible protocols specified.') @@ -238,108 +275,128 @@ def run(data_path, output_path, qc_map, wells=None, n = min(max_processes, m) args = zip(readnames, savenames, times_list, [output_path] * m, [data_path] * m, [wells] * m, [write_traces] * m, [save_id] * m) - well_selections, qc_dfs = list(zip(*starmap(n, run_qc_for_protocol, args))) + well_selections, qc_dfs = zip(*starmap(n, run_qc_for_protocol, args)) + + pront(well_selections) + pront(qc_dfs) + + # + # Assuming a single QC protocol. At this point, we have + # + # well_selections: a tuple ``(s1, s2)``, where ``s1`` lists the cells + # passing QC on staircase 1, and ``s2`` lists those + # passing QC on staircase 2. + # qc_dfs: a tuple ``(d1, d2)`` where ``d1`` is a dataframe of staircase 1 + # results. It doesn't need to be a dataframe, really. Just a 2d + # matrix of ``n_cells`` rows, and ``n_crit + 2`` QC criteria. + # in addition to QC criteria, each row starts with the well code + # and ends with the shortened protocol name. The QC crit cells + # contain True of False, reasons for failing are not included. + # qc_df = pd.concat(qc_dfs, ignore_index=True) + pront(qc_df) - # Do QC which requires both repeats - # qc3.bookend check very first and very last staircases are similar - protocol, savename = list(qc_map.items())[0] - times = sorted(res_dict[protocol]) - if len(times) == 4: - qc3_bookend_dict = qc3_bookend( - protocol, savename, times, wells, output_path, data_path, - figure_size, save_id) - else: - qc3_bookend_dict = {well: True for well in qc_df.well.unique()} + # + # At this point, ``qc_df`` is a single dataframe containing the information + # from both qc_dfs. 
The protocol is indicated with the shortened name, + # where for the second run it has _2 appended + # - qc_df['qc3.bookend'] = [qc3_bookend_dict[well] for well in qc_df.well] + # Combine QC protocls into overall_selection + selection = [set(x) for x in well_selections] + selection = selection[0].intersection(*selection[1:]) - #  qc_df will be updated and saved again, but it's useful to save them here for debugging - # Write qc_df to file - qc_df.to_csv(os.path.join(output_path, 'QC-%s.csv' % save_id)) + # Store "plain QC" selections in "selected" files + fname = os.path.join(output_path, f'selected-{save_id}.txt') + with open(fname, 'w') as f: + f.write('\n'.join(selection)) + for partial, protocol in zip(well_selections, list(savenames)): + fname = os.path.join(output_path, f'selected-{save_id}-{protocol}.txt') + with open(fname, 'w') as f: + f.write('\n'.join(partial)) - # Write data to JSON file - qc_df.to_json(os.path.join(output_path, 'QC-%s.json' % save_id), - orient='records') + # + # Now go over _all_ protocols, including the QC protocols (AGAIN!), and + # call extract_protocol() on them + # + if write_traces: - # Overwrite old files - for protocol in list(qc_map.values()): - fname = os.path.join(output_path, 'selected-%s-%s.txt' % (save_id, protocol)) - with open(fname, 'w') as fout: - pass - - overall_selection = [] - for well in qc_df.well.unique(): - failed = False - for well_selection, protocol in zip(well_selections, list(savenames)): - - logging.debug(f"{well_selection} selected from protocol {protocol}") - fname = os.path.join(output_path, 'selected-%s-%s.txt' % - (save_id, protocol)) - if well not in well_selection: - failed = True - else: - with open(fname, 'a') as fout: - fout.write(well) - fout.write('\n') - - # well in every selection - if not failed: - overall_selection.append(well) - - selectedfile = os.path.join(output_path, 'selected-%s.txt' % save_id) - with open(selectedfile, 'w') as fout: - for well in overall_selection: - fout.write(well) - fout.write('\n') + pront(savenames, readnames, times_list) - # Export all protocols - savenames, readnames, times_list = [], [], [] - for protocol in res_dict: + # Export all protocols + savenames, readnames, times_list = [], [], [] + for protocol in res_dict: - # Sort into chronological order - times = sorted(res_dict[protocol]) - savename = combined_dict[protocol] + # Sort into chronological order + times = sorted(res_dict[protocol]) + savename = combined_dict[protocol] - readnames.append(protocol) + readnames.append(protocol) - if len(times) == 2: - savenames.append(savename) - times_list.append(times) + if len(times) == 2: + savenames.append(savename) + times_list.append(times) - elif len(times) == 4: - savenames.append(savename) - times_list.append(times[::2]) + elif len(times) == 4: + savenames.append(savename) + times_list.append(times[::2]) - # Make seperate savename for protocol repeat - savename = combined_dict[protocol] + '_2' - assert savename not in combined_dict.values() - savenames.append(savename) - times_list.append(times[1::2]) - readnames.append(protocol) + # Make seperate savename for protocol repeat + savename = combined_dict[protocol] + '_2' + assert savename not in combined_dict.values() + savenames.append(savename) + times_list.append(times[1::2]) + readnames.append(protocol) - # TODO Else raise error? + # TODO Else raise error? 
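+
+            # Illustration (hypothetical, using the staircase times from the
+            # README example): with four runs of 'staircaseramp (2)_2kHz'
+            # mapped to 'staircaseramp', the sorted times
+            #     ['15.01.07', '15.06.53', '15.11.33', '15.17.19']
+            # would add
+            #     savenames  -> [..., 'staircaseramp', 'staircaseramp_2']
+            #     times_list -> [..., ['15.01.07', '15.11.33'],
+            #                         ['15.06.53', '15.17.19']]
+            # i.e. each repeat pairs a before-drug time with its after-drug
+            # counterpart.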
- wells_to_export = wells if write_failed_traces else overall_selection - logging.info(f"exporting wells {wells}") + pront(savenames, readnames, times_list) - m = len(readnames) - n = min(max_processes, m) - args = zip(readnames, savenames, times_list, [wells_to_export] * m, - [output_path] * m, [data_path] * m, [write_traces] * m, - [figure_size] * m, [reversal_potential] * m, [save_id] * m) - dfs = starmap(n, extract_protocol, args) - - if dfs: - extract_df = pd.concat(dfs, ignore_index=True) - extract_df['selected'] = extract_df['well'].isin(overall_selection) + wells_to_export = wells if write_failed_traces else selection + logging.info(f'exporting wells {wells_to_export}') + m = len(readnames) + n = min(max_processes, m) + args = zip(readnames, savenames, times_list, [wells_to_export] * m, + [output_path] * m, [data_path] * m, [figure_size] * m, + [reversal_potential] * m, [save_id] * m) + dfs = starmap(n, extract_protocol, args) + + pront(len(dfs)) + for df in dfs: + pront(df) + sys.exit(1) + + if dfs: + extract_df = pd.concat(dfs, ignore_index=True) + extract_df['selected'] = extract_df['well'].isin(selection) + else: + logging.error("Didn't export any data") + return + + logging.info(f"extract_df: {extract_df}") + + # + # Do QC3 on first staircase, first sweep VS second staircase, second sweep + # qc3.bookend check very first and very last staircases are similar + # + + protocol, savename = list(qc_map.items())[0] + times = sorted(res_dict[protocol]) + if len(times) == 4: + qc3_bookend_dict = qc3_bookend( + protocol, savename, times, wells, output_path, data_path, + figure_size, save_id) else: - logging.error("Didn't export any data") - return + #TODO: Better indicate that it wasn't run? + qc3_bookend_dict = {well: True for well in qc_df.well.unique()} + qc_df['qc3.bookend'] = [qc3_bookend_dict[well] for well in qc_df.well] + pront(qc_df) - logging.info(f"extract_df: {extract_df}") + # + # + # qc_erev_spread = {} erev_spreads = {} @@ -437,10 +494,10 @@ def run(data_path, output_path, qc_map, wells=None, qc_styled_df.to_latex(os.path.join(output_path, 'qc_table.tex')) # Save in csv format - qc_df.to_csv(os.path.join(output_path, 'QC-%s.csv' % save_id)) + qc_df.to_csv(os.path.join(output_path, f'QC-{save_id}.csv')) # Write data to JSON file - qc_df.to_json(os.path.join(output_path, 'QC-%s.json' % save_id), + qc_df.to_json(os.path.join(output_path, f'QC-{save_id}.json'), orient='records') #  Load only QC vals. TODO use a new variabile name to avoid confusion @@ -528,60 +585,58 @@ def agg_func(x): def extract_protocol(readname, savename, time_strs, selected_wells, savedir, - data_path, write_traces, figure_size, reversal_potential, - save_id): + data_path, figure_size, reversal_potential, save_id): # TODO: Tidy up argument order """ ??? 
""" + print(f'EXTRACT PROTOCOL {savename}') + #logging.info(f"Exporting {readname} as {savename}") - logging.info(f"extracting {savename}") - + savedir = os.path.join(savedir, '2-extract-protocol') traces_dir = os.path.join(savedir, 'traces') os.makedirs(traces_dir, exist_ok=True) - - row_dict = {} - subtraction_plots_dir = os.path.join(savedir, 'subtraction_plots') os.makedirs(subtraction_plots_dir, exist_ok=True) - logging.info(f"Exporting {readname} as {savename}") + row_dict = {} - filepath_before = os.path.join(data_path, f'{readname}_{time_strs[0]}') - filepath_after = os.path.join(data_path, f'{readname}_{time_strs[1]}') - json_file_before = f"{readname}_{time_strs[0]}" - json_file_after = f"{readname}_{time_strs[1]}" - before_trace = Trace(filepath_before, json_file_before) - after_trace = Trace(filepath_after, json_file_after) + before_trace = Trace( + os.path.join(data_path, f'{readname}_{time_strs[0]}'), + f'{readname}_{time_strs[0]}.json') + after_trace = Trace( + os.path.join(data_path, f'{readname}_{time_strs[1]}'), + f'{readname}_{time_strs[1]}') - voltage_protocol = before_trace.get_voltage_protocol() + # Get times and voltages times = before_trace.get_times() voltages = before_trace.get_voltage() + assert len(times) == len(after_trace.get_times()) + assert np.all(np.abs(times - after_trace.get_times()) < 1.8) + assert np.all(np.abs(voltages - after_trace.get_voltage()) < 1.8) #  Find start of leak section + voltage_protocol = before_trace.get_voltage_protocol() desc = voltage_protocol.get_all_sections() ramp_bounds = detect_ramp_bounds(times, desc) - nsweeps_before = before_trace.NofSweeps = 2 - nsweeps_after = after_trace.NofSweeps = 2 - - assert nsweeps_before == nsweeps_after - - # Time points - times_before = before_trace.get_times() - times_after = after_trace.get_times() + nsweeps = before_trace.NofSweeps + assert nsweeps == after_trace.NofSweeps - try: - assert all(np.abs(times_before - times_after) < 1e-8) - except Exception as exc: - logging.warning(f"Exception thrown when handling {savename}: ", str(exc)) - return - - header = "\"current\"" + # Store voltages and times, using 2 different libraries... 
+ voltage_df = pd.DataFrame( + np.vstack((times.flatten(), voltages.flatten())).T, + columns=['time', 'voltage']) + voltage_df.to_csv(os.path.join( + traces_dir, f"{save_id}-{savename}-voltages.csv")) + #np.savetxt(os.path.join(traces_dir, f"{save_id}-{savename}-times.csv"), + # times) qc_before = before_trace.get_onboard_QC_values() qc_after = after_trace.get_onboard_QC_values() - qc_vals_all = before_trace.get_onboard_QC_values() + + before_data = before_trace.get_trace_sweeps() + after_data = after_trace.get_trace_sweeps() for i_well, well in enumerate(selected_wells): # Go through all wells if i_well % 24 == 0: @@ -590,45 +645,16 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, if None in qc_before[well] or None in qc_after[well]: continue - if write_traces: - # Save 'before drug' trace as .csv - for sweep in range(nsweeps_before): - out = before_trace.get_trace_sweeps([sweep])[well][0] - save_fname = os.path.join(traces_dir, f"{save_id}-{savename}-" - f"{well}-before-sweep{sweep}.csv") - - np.savetxt(save_fname, out, delimiter=',', - header=header) - - if write_traces: - # Save 'after drug' trace as .csv - for sweep in range(nsweeps_after): - save_fname = os.path.join(traces_dir, f"{save_id}-{savename}-" - f"{well}-after-sweep{sweep}.csv") - out = after_trace.get_trace_sweeps([sweep])[well][0] - if len(out) > 0: - np.savetxt(save_fname, out, - delimiter=',', comments='', header=header) - - voltage_before = before_trace.get_voltage() - voltage_after = after_trace.get_voltage() - - assert len(voltage_before) == len(voltage_after) - assert len(voltage_before) == len(times_before) - assert len(voltage_after) == len(times_after) - voltage = voltage_before - - voltage_df = pd.DataFrame(np.vstack((times_before.flatten(), - voltage.flatten())).T, - columns=['time', 'voltage']) - - if not os.path.exists(os.path.join(traces_dir, - f"{save_id}-{savename}-voltages.csv")): - voltage_df.to_csv(os.path.join(traces_dir, - f"{save_id}-{savename}-voltages.csv")) - - np.savetxt(os.path.join(traces_dir, f"{save_id}-{savename}-times.csv"), - times_before) + # Save before and after drug traces as .csv + for sweep in range(nsweeps): + save_fname = os.path.join( + traces_dir, f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv') + np.savetxt(save_fname, before_data[well][sweep], delimiter=',', + header='"current"') + save_fname = os.path.join( + traces_dir, f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv') + np.savetxt(save_fname, after_data[well][sweep], delimiter=',', + header='"current"') # plot subtraction fig = plt.figure(figsize=figure_size, layout='constrained') @@ -648,21 +674,21 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, os.makedirs(out1, exist_ok=True) os.makedirs(out2, exist_ok=True) + + hergqc = hERGQC(voltages, before_trace.sampling_rate) for well in selected_wells: - before_current = before_trace.get_trace_sweeps()[well] - after_current = after_trace.get_trace_sweeps()[well] before_leak_currents = [] after_leak_currents = [] - for sweep in range(before_current.shape[0]): + for sweep in range(nsweeps): row_dict = { 'well': well, 'sweep': sweep, 'protocol': savename } - qc_vals = qc_vals_all[well][sweep] + qc_vals = qc_before[well][sweep] if qc_vals is None: continue if len(qc_vals) == 0: @@ -673,7 +699,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['Rseries'] = qc_vals[2] before_params, before_leak = fit_linear_leak( - before_current[sweep], voltages, times, *ramp_bounds, + 
before_data[well][sweep], voltages, times, *ramp_bounds, save_fname=os.path.join(out1, f'{well}_sweep{sweep}.png')) before_leak_currents.append(before_leak) @@ -682,7 +708,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['E_leak_before'] = -before_params[0] / before_params[1] after_params, after_leak = fit_linear_leak( - after_current[sweep, :], voltages, times, *ramp_bounds, + after_data[well][sweep], voltages, times, *ramp_bounds, save_fname=os.path.join(out2, f'{well}_sweep{sweep}.png')) after_leak_currents.append(after_leak) @@ -691,10 +717,10 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['gleak_after'] = after_params[1] row_dict['E_leak_after'] = -after_params[0] / after_params[1] - subtracted_trace = before_current[sweep, :] - before_leak\ - - (after_current[sweep, :] - after_leak) - after_corrected = after_current[sweep, :] - after_leak - before_corrected = before_current[sweep, :] - before_leak + subtracted_trace = before_data[well][sweep] - before_leak\ + - (after_data[well][sweep] - after_leak) + after_corrected = after_data[well][sweep] - after_leak + before_corrected = before_data[well][sweep] - before_leak E_rev_before = infer_reversal_potential( before_corrected, times, desc, voltages, @@ -728,7 +754,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, # Check QC6 for each protocol (not just the staircase) - hergqc = hERGQC(voltage, before_trace.sampling_rate) + times = before_trace.get_times() voltage = before_trace.get_voltage() @@ -758,11 +784,10 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['QC4'] = all([x for x, _ in qc4]) - if write_traces: - out_fname = os.path.join(traces_dir, - f"{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv") + out_fname = os.path.join(traces_dir, + f"{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv") + np.savetxt(out_fname, subtracted_trace.flatten()) - np.savetxt(out_fname, subtracted_trace.flatten()) rows.append(row_dict) param, leak = fit_linear_leak(current, voltage, times, @@ -794,12 +819,9 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, times = before_trace.get_times() voltages = before_trace.get_voltage() - before_current_all = before_trace.get_trace_sweeps() - after_current_all = after_trace.get_trace_sweeps() - for well in selected_wells: - before_current = before_current_all[well] - after_current = after_current_all[well] + before_current = before_data[well] + after_current = after_data[well] before_leak_currents = before_leak_current_dict[well] after_leak_currents = after_leak_current_dict[well] @@ -821,181 +843,140 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, plt.close(fig) - protocol_dir = os.path.join(traces_dir, 'protocols') - os.makedirs(protocol_dir, exist_ok=True) - - # extract protocol - protocol = before_trace.get_voltage_protocol() - protocol.export_txt(os.path.join(protocol_dir, - f'{save_id}-{savename}.txt')) - - json_protocol = before_trace.get_voltage_protocol_json() - - with open(os.path.join(protocol_dir, f'{save_id}-{savename}.json'), 'w') as fout: - json.dump(json_protocol, fout) - return extract_df def run_qc_for_protocol(readname, savename, time_strs, output_path, data_path, wells, write_traces, save_id): """ - ??? + Runs a QC procedure for a single protocol, on the selected wells. 
+ + Assumes: + - time_strs has length 2, corresponding to a before- and after-drug trace + - each recording has 2 sweeps + + The following procedure is followed: + - Traces are leak corrected with `fit_linear_leak` + - Traces are drug subtracted + - No capacitative spike filtering + + Also: + - Creates a directory ``output_path/1-qc`` + - Writes leak subtracted traces at ``1-qc/save_id-...csv`` + - Makes leak correction plots at ``output_path/1-qc/leak_correction``. + These are the plots made by `fit_linear_leak()`. + + @param readname The protocol name, without the time part + @param savename The shorter name for the protocol + @param time_strs The time part of the protocol names, must have length 2 + @param output_path The root directory to chuck everything in + @param data_path The path to read data from + @param wells The wells to process + @param write_traces True if traces should be written too + @param save_id + + Returns a tuple `(selected_wells, detailed_qc)`` where ``selected_wells`` + is a list containing all selected well names, and ``detailed_qc`` is a + pandas dataframe containing the pass/fail status of all individual QC + criteria, for all wells. """ - # TODO Tidy up argument order - - df_rows = [] + print(f'RUN HERG QC FOR {readname}, {time_strs}, {savename}') + # TODO Tidy up argument order assert len(time_strs) == 2 - filepath_before = os.path.join(data_path, f'{readname}_{time_strs[0]}') - json_file_before = f"{readname}_{time_strs[0]}" - - filepath_after = os.path.join(data_path, f'{readname}_{time_strs[1]}') - json_file_after = f"{readname}_{time_strs[1]}" - - logging.debug(f"loading {json_file_after} and {json_file_before}") + before_trace = Trace( + os.path.join(data_path, f'{readname}_{time_strs[0]}'), + f'{readname}_{time_strs[0]}.json') + after_trace = Trace( + os.path.join(data_path, f'{readname}_{time_strs[1]}'), + f'{readname}_{time_strs[1]}.json') - before_trace = Trace(filepath_before, json_file_before) - after_trace = Trace(filepath_after, json_file_after) - assert before_trace.sampling_rate == after_trace.sampling_rate - - # Convert to s + # Assert that protocols are exactly the same sampling_rate = before_trace.sampling_rate - - savedir = os.path.join(output_path, 'QC') - plot_dir = os.path.join(savedir, 'leak_correction') + assert sampling_rate == after_trace.sampling_rate + voltage = before_trace.get_voltage() + assert np.all(voltage == after_trace.get_voltage()) + hergqc = hERGQC(voltage, sampling_rate) + + # Store stuff to "QC/leak_correction" + save_dir = os.path.join(output_path, '1-qc') + plot_dir = os.path.join(save_dir, 'leak_correction') os.makedirs(plot_dir, exist_ok=True) - before_voltage = before_trace.get_voltage() - after_voltage = after_trace.get_voltage() - - # Assert that protocols are exactly the same - assert np.all(before_voltage == after_voltage) - - voltage = before_voltage - + # Assume two sweeps! 
sweeps = [0, 1] - raw_before_all = before_trace.get_trace_sweeps(sweeps) - raw_after_all = after_trace.get_trace_sweeps(sweeps) - - selected_wells = [] - - hergqc = hERGQC(before_voltage, sampling_rate) + nsweeps = len(sweeps) + raw_before = before_trace.get_trace_sweeps(sweeps) + raw_after = after_trace.get_trace_sweeps(sweeps) qc_before = before_trace.get_onboard_QC_values() qc_after = after_trace.get_onboard_QC_values() + # Get ramp bounds + times = before_trace.get_times() + nsamples = len(times) + v_protocol = VoltageProtocol.from_voltage_trace(voltage, times) + v_sections = v_protocol.get_all_sections() + ramp_bounds = detect_ramp_bounds(times, v_sections) + + df_rows = [] + selected_wells = [] for well in wells: # Check if any cell first! if (None in qc_before[well][0]) or (None in qc_after[well][0]): - no_cell = True continue - else: - no_cell = False - - nsweeps = before_trace.NofSweeps - assert after_trace.NofSweeps == nsweeps - before_currents_corrected = np.empty((nsweeps, before_trace.NofSamples)) - after_currents_corrected = np.empty((nsweeps, after_trace.NofSamples)) - - before_currents = np.empty((nsweeps, before_trace.NofSamples)) - after_currents = np.empty((nsweeps, after_trace.NofSamples)) - - # Get ramp times from protocol description - voltage_protocol = VoltageProtocol.from_voltage_trace( - voltage, before_trace.get_times()) - - #  Find start of leak section - desc = voltage_protocol.get_all_sections() - ramp_locs = np.argwhere(desc[:, 2] != desc[:, 3]).flatten() - tstart = desc[ramp_locs[0], 0] - tend = voltage_protocol.get_ramps()[0][1] - - times = before_trace.get_times() - - ramp_bounds = [np.argmax(times > tstart), np.argmax(times > tend)] - - assert after_trace.NofSamples == before_trace.NofSamples - - for sweep in range(nsweeps): - before_raw = np.array(raw_before_all[well])[sweep, :] - after_raw = np.array(raw_after_all[well])[sweep, :] - - before_params1, before_leak = fit_linear_leak( - before_raw, voltage, times, *ramp_bounds, + corrected_before = np.empty((nsweeps, nsamples)) + corrected_after = np.empty((nsweeps, nsamples)) + for isweep in range(nsweeps): + # Get corrected traces, and make plot + _, before_leak = fit_linear_leak( + raw_before[well][isweep], voltage, times, *ramp_bounds, save_fname=os.path.join( - plot_dir, f'{well}-sweep{sweep}-before.png')) - - after_params1, after_leak = fit_linear_leak( - after_raw, voltage, times, *ramp_bounds, + plot_dir, f'{savename}-{well}-{isweep}-before.png')) + _, after_leak = fit_linear_leak( + raw_after[well][isweep], voltage, times, *ramp_bounds, save_fname=os.path.join( - plot_dir, f'{well}-sweep{sweep}-after.png')) - - before_currents_corrected[sweep, :] = before_raw - before_leak - after_currents_corrected[sweep, :] = after_raw - after_leak + plot_dir, f'{savename}-{well}-{isweep}-after.png')) - before_currents[sweep, :] = before_raw - after_currents[sweep, :] = after_raw - - logging.info(f"{well} {savename}\n----------") - logging.info(f"sampling_rate is {sampling_rate}") - - voltage_steps = [tend - for tstart, tend, vstart, vend in - voltage_protocol.get_all_sections() if vend == vstart] + corrected_before[isweep] = raw_before[well][isweep] - before_leak + corrected_after[isweep] = raw_after[well][isweep] - after_leak # Run QC with raw currents QC = hergqc.run_qc( - voltage_steps, - times, - before_currents_corrected, - after_currents_corrected, - np.array(qc_before[well])[0, :], - np.array(qc_after[well])[0, :], - nsweeps, - ) + [v0 for t0, t1, v0, v1 in v_sections if v1 == v0], + times, 
corrected_before, corrected_after, + qc_before[well][0], qc_after[well][0], nsweeps) df_rows.append([well] + QC.passed_list()) - selected = QC.all_passed() and not no_cell - if selected: + if QC.all_passed(): selected_wells.append(well) # Save subtracted current in csv file - header = "\"current\"" - - for i in range(nsweeps): - - savepath = os.path.join(savedir, - f'{save_id}-{savename}-{well}-sweep{i}.csv') - subtracted_current = before_currents_corrected[i, :] - after_currents_corrected[i, :] - - if write_traces: - os.makedirs(savedir, exist_ok=True) - - np.savetxt(savepath, subtracted_current, delimiter=',', - comments='', header=header) - + if write_traces: + header = '"current"' + for i in range(nsweeps): + savepath = os.path.join( + save_dir, f'{save_id}-{savename}-{well}-sweep{i}.csv') + subtracted = corrected_before[i] - corrected_after[i] + np.savetxt(savepath, subtracted, delimiter=',', header=header) + + # TODO: Depends on QC used column_labels = ['well', 'qc1.rseal', 'qc1.cm', 'qc1.rseries', 'qc2.raw', 'qc2.subtracted', 'qc3.raw', 'qc3.E4031', 'qc3.subtracted', 'qc4.rseal', 'qc4.cm', 'qc4.rseries', 'qc5.staircase', 'qc5.1.staircase', 'qc6.subtracted', 'qc6.1.subtracted', 'qc6.2.subtracted'] - df = pd.DataFrame(np.array(df_rows), columns=column_labels) - - missing_wells_dfs = [] - # Add onboard qc to dataframe - for well in wells: - if well not in df['well'].values: - onboard_qc_df = pd.DataFrame([[well] + [False for col in - list(df)[1:]]], - columns=list(df)) - missing_wells_dfs.append(onboard_qc_df) - df = pd.concat([df] + missing_wells_dfs, ignore_index=True) + # Add skipped wells + ncriteria = len(column_labels) - 1 + for well in set(wells) - set([row[0] for row in df_rows]): + df_rows.append([well] + [False] * ncriteria) + # Create data frame and return + df = pd.DataFrame(np.array(df_rows), columns=column_labels) df['protocol'] = savename return selected_wells, df @@ -1003,6 +984,36 @@ def run_qc_for_protocol(readname, savename, time_strs, output_path, def qc3_bookend(readname, savename, time_strs, wells, output_path, data_path, figure_size, save_id): + """ + Joey's "bookend" test, comparing a staircase before other protocols with a + staircase after. + + This method assumes 4 time strings are given, for 4 files with 2 sweeps + each. It loads data, performs leak correction, drug subtraction, and + capacitance filtering, before comparing the first sweep of the first + staircase with the second sweep of the second staircase. + + Also: + - Creates a directory ``output_path/3-qc3-bookend`` + - Creates plots comparing staircase 1, sweep 1, with staircase 2, sweep 2. + + TODO: This method repeats lots of steps, unneccesarily: + - loading all data + - leak correction + - drug subtraction + + @param readname The protocol name, without the time part + @param savename The shorter name for the protocol + @param time_strs The time part of the protocol names, must have length 4 + @param wells The wells to process + @param output_path The root directory to chuck everything in + @param data_path The path to read data from + @param figure_size + + Returns a dictionary mapping well names to booleans. 
+ """ + print('RUN QC3 bookend') + assert len(time_strs) == 4 filepath_first_before = os.path.join(data_path, f'{readname}_{time_strs[0]}') filepath_last_before = os.path.join(data_path, f'{readname}_{time_strs[1]}') @@ -1010,17 +1021,16 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, json_file_last_before = f"{readname}_{time_strs[1]}" #  Each Trace object contains two sweeps - first_before_trace = Trace(filepath_first_before, - json_file_first_before) - last_before_trace = Trace(filepath_last_before, - json_file_last_before) + first_before_trace = Trace(filepath_first_before, json_file_first_before) + last_before_trace = Trace(filepath_last_before, json_file_last_before) times = first_before_trace.get_times() voltage = first_before_trace.get_voltage() voltage_protocol = first_before_trace.get_voltage_protocol() - ramp_bounds = detect_ramp_bounds(times, - voltage_protocol.get_all_sections()) + ramp_bounds = detect_ramp_bounds( + times, voltage_protocol.get_all_sections()) + filepath_first_after = os.path.join(data_path, f'{readname}_{time_strs[2]}') filepath_last_after = os.path.join(data_path, f'{readname}_{time_strs[3]}') json_file_first_after = f"{readname}_{time_strs[2]}" @@ -1043,81 +1053,50 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, last_before_current_dict = last_before_trace.get_trace_sweeps() last_after_current_dict = last_after_trace.get_trace_sweeps() - #  Do leak subtraction and store traces for each well - #  TODO Refactor this code into a single loop. There's no need to store each individual trace. - before_traces_first = {} - before_traces_last = {} - after_traces_first = {} - after_traces_last = {} - first_processed = {} - last_processed = {} - - #  Iterate over all wells - for well in np.array(wells).flatten(): - first_before_current = first_before_current_dict[well][0, :] - first_after_current = first_after_current_dict[well][0, :] - last_before_current = last_before_current_dict[well][-1, :] - last_after_current = last_after_current_dict[well][-1, :] - - save_fname = f"{well}_{savename}_before0.pdf" - - before_traces_first[well] = get_leak_corrected(first_before_current, - voltage, times, - *ramp_bounds) - - before_traces_last[well] = get_leak_corrected(last_before_current, - voltage, times, - *ramp_bounds) - - after_traces_first[well] = get_leak_corrected(first_after_current, - voltage, times, - *ramp_bounds) - after_traces_last[well] = get_leak_corrected(last_after_current, - voltage, times, - *ramp_bounds) - - # Store subtracted traces - first_processed[well] = before_traces_first[well] - after_traces_first[well] - last_processed[well] = before_traces_last[well] - after_traces_last[well] - - voltage_protocol = VoltageProtocol.from_voltage_trace(voltage, times) + # Get steps for capacitance filtering + voltage_steps = [tstart + for tstart, tend, vstart, vend in + voltage_protocol.get_all_sections() if vend == vstart] - plot_dir = os.path.join(output_path, output_path, - f'{save_id}-{savename}-qc3-bookend') - os.makedirs(plot_dir, exist_ok=True) + # Create QC object. 
Only uses qc3 which doesn't plot, so no plot_dir needed hergqc = hERGQC(sampling_rate=first_before_trace.sampling_rate, - plot_dir=plot_dir, voltage=voltage) - assert first_before_trace.NofSweeps == last_before_trace.NofSweeps - - voltage_steps = [tstart - for tstart, tend, vstart, vend in - voltage_protocol.get_all_sections() if vend == vstart] - res_dict = {} + # Update output path + output_path = os.path.join(output_path, '3-qc3-bookend') + os.makedirs(output_path, exist_ok=True) + # Create figure - will be reused fig = plt.figure(figsize=figure_size) ax = fig.subplots() - for well in wells: - trace1 = hergqc.filter_capacitive_spikes( - first_processed[well], times, voltage_steps - ).flatten() - - trace2 = hergqc.filter_capacitive_spikes( - last_processed[well], times, voltage_steps - ).flatten() - - passed = hergqc.qc3(trace1, trace2)[0] - - res_dict[well] = passed - - save_fname = os.path.join(output_path, 'qc3_bookend') + #  Iterate over all wells, perform qc3-bookend, plot and store + res_dict = {} + for well in np.array(wells).flatten(): + # First staircase, before drug and after drug, first sweep + before_trace_first = get_leak_corrected( + first_before_current_dict[well][0], voltage, times, *ramp_bounds) + after_trace_first = get_leak_corrected( + first_after_current_dict[well][0], voltage, times, *ramp_bounds) + first = before_trace_first - after_trace_first + + # Second staircase, before drug and after drug, second sweep + before_trace_last = get_leak_corrected( + last_before_current_dict[well][-1], voltage, times, *ramp_bounds) + after_trace_last = get_leak_corrected( + last_after_current_dict[well][-1], voltage, times, *ramp_bounds) + last = before_trace_last - after_trace_last + + trace1 = hergqc.filter_capacitive_spikes(first, times, voltage_steps) + trace2 = hergqc.filter_capacitive_spikes(last, times, voltage_steps) + + res_dict[well] = hergqc.qc3(trace1, trace2)[0] + + save_fname = os.path.join(output_path, f'qc3_bookend-{well}.png') + ax.cla() ax.plot(times, trace1) ax.plot(times, trace2) - fig.savefig(save_fname) - ax.cla() plt.close(fig) return res_dict diff --git a/pcpostprocess/scripts/summarise_herg_export.py b/pcpostprocess/scripts/summarise_herg_export.py index 9cbdc0c..d8969d9 100644 --- a/pcpostprocess/scripts/summarise_herg_export.py +++ b/pcpostprocess/scripts/summarise_herg_export.py @@ -49,9 +49,6 @@ def run(data_path, output_path, experiment_name, reversal_potential=None, @param reversal_potential The calculated reversal potential, or ``None`` @param figsize The matplotlib figure size, or ``None``. 
""" - # TODO: Find some way around setting this - matplotlib.use('Agg') - output_path = setup_output_directory(output_path) leak_parameters_df = pd.read_csv(os.path.join(data_path, 'subtraction_qc.csv')) From b8b5604dd5c8dbc1fbb39d8bc8f5d83d4dbaf66c Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Wed, 19 Nov 2025 18:49:20 +0000 Subject: [PATCH 2/9] Working through run-herg-qc --- pcpostprocess/scripts/run_herg_qc.py | 93 ++++++++++++++-------------- 1 file changed, 47 insertions(+), 46 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index 0d4df65..fe343e7 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -320,62 +320,63 @@ def pront(*args): # Now go over _all_ protocols, including the QC protocols (AGAIN!), and # call extract_protocol() on them # - if write_traces: + pront(savenames, readnames, times_list) - pront(savenames, readnames, times_list) - - # Export all protocols - savenames, readnames, times_list = [], [], [] - for protocol in res_dict: - - # Sort into chronological order - times = sorted(res_dict[protocol]) - savename = combined_dict[protocol] - - readnames.append(protocol) + # Export all protocols + savenames, readnames, times_list = [], [], [] + for protocol in res_dict: - if len(times) == 2: - savenames.append(savename) - times_list.append(times) + # Sort into chronological order + times = sorted(res_dict[protocol]) + savename = combined_dict[protocol] - elif len(times) == 4: - savenames.append(savename) - times_list.append(times[::2]) + readnames.append(protocol) - # Make seperate savename for protocol repeat - savename = combined_dict[protocol] + '_2' - assert savename not in combined_dict.values() - savenames.append(savename) - times_list.append(times[1::2]) - readnames.append(protocol) + if len(times) == 2: + savenames.append(savename) + times_list.append(times) - # TODO Else raise error? + elif len(times) == 4: + savenames.append(savename) + times_list.append(times[::2]) + # Make seperate savename for protocol repeat + savename = combined_dict[protocol] + '_2' + assert savename not in combined_dict.values() + savenames.append(savename) + times_list.append(times[1::2]) + readnames.append(protocol) - pront(savenames, readnames, times_list) + # TODO Else raise error? 
- wells_to_export = wells if write_failed_traces else selection - logging.info(f'exporting wells {wells_to_export}') - m = len(readnames) - n = min(max_processes, m) - args = zip(readnames, savenames, times_list, [wells_to_export] * m, - [output_path] * m, [data_path] * m, [figure_size] * m, - [reversal_potential] * m, [save_id] * m) - dfs = starmap(n, extract_protocol, args) - pront(len(dfs)) - for df in dfs: - pront(df) - sys.exit(1) + pront(savenames, readnames, times_list) - if dfs: - extract_df = pd.concat(dfs, ignore_index=True) - extract_df['selected'] = extract_df['well'].isin(selection) - else: - logging.error("Didn't export any data") - return + wells_to_export = wells if write_failed_traces else selection + logging.info(f'exporting wells {wells_to_export}') + m = len(readnames) + n = min(max_processes, m) + args = zip(readnames, savenames, times_list, [wells_to_export] * m, + [output_path] * m, [data_path] * m, [figure_size] * m, + [reversal_potential] * m, [save_id] * m) + dfs = starmap(n, extract_protocol, args) + + pd.set_option('display.max_columns', None) + pd.set_option('display.max_rows', None) + + pront(len(dfs)) + for df in dfs: + pront(df) + sys.exit(1) + + if dfs: + extract_df = pd.concat(dfs, ignore_index=True) + extract_df['selected'] = extract_df['well'].isin(selection) + else: + logging.error("Didn't export any data") + return - logging.info(f"extract_df: {extract_df}") + logging.info(f"extract_df: {extract_df}") # # Do QC3 on first staircase, first sweep VS second staircase, second sweep @@ -638,7 +639,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, before_data = before_trace.get_trace_sweeps() after_data = after_trace.get_trace_sweeps() - for i_well, well in enumerate(selected_wells): # Go through all wells + for i_well, well in enumerate(selected_wells): if i_well % 24 == 0: logging.info('row ' + well[0]) From 80858fb3a8f39d6398373fd73274ae300936a701 Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Wed, 19 Nov 2025 19:07:38 +0000 Subject: [PATCH 3/9] Working through run-herg-qc --- pcpostprocess/scripts/run_herg_qc.py | 41 ++++------------------------ 1 file changed, 5 insertions(+), 36 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index fe343e7..b9e6c50 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -273,8 +273,7 @@ def pront(*args): m = len(readnames) n = min(max_processes, m) - args = zip(readnames, savenames, times_list, [output_path] * m, - [data_path] * m, [wells] * m, [write_traces] * m, [save_id] * m) + args = zip(readnames, savenames, times_list, [data_path] * m, [wells] * m) well_selections, qc_dfs = zip(*starmap(n, run_qc_for_protocol, args)) pront(well_selections) @@ -594,7 +593,6 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, print(f'EXTRACT PROTOCOL {savename}') #logging.info(f"Exporting {readname} as {savename}") - savedir = os.path.join(savedir, '2-extract-protocol') traces_dir = os.path.join(savedir, 'traces') os.makedirs(traces_dir, exist_ok=True) subtraction_plots_dir = os.path.join(savedir, 'subtraction_plots') @@ -847,8 +845,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, return extract_df -def run_qc_for_protocol(readname, savename, time_strs, output_path, - data_path, wells, write_traces, save_id): +def run_qc_for_protocol(readname, savename, time_strs, data_path, wells): """ Runs a QC procedure for a single protocol, on the selected 
wells. @@ -861,20 +858,12 @@ def run_qc_for_protocol(readname, savename, time_strs, output_path, - Traces are drug subtracted - No capacitative spike filtering - Also: - - Creates a directory ``output_path/1-qc`` - - Writes leak subtracted traces at ``1-qc/save_id-...csv`` - - Makes leak correction plots at ``output_path/1-qc/leak_correction``. - These are the plots made by `fit_linear_leak()`. @param readname The protocol name, without the time part @param savename The shorter name for the protocol @param time_strs The time part of the protocol names, must have length 2 - @param output_path The root directory to chuck everything in @param data_path The path to read data from @param wells The wells to process - @param write_traces True if traces should be written too - @param save_id Returns a tuple `(selected_wells, detailed_qc)`` where ``selected_wells`` is a list containing all selected well names, and ``detailed_qc`` is a @@ -900,11 +889,6 @@ def run_qc_for_protocol(readname, savename, time_strs, output_path, assert np.all(voltage == after_trace.get_voltage()) hergqc = hERGQC(voltage, sampling_rate) - # Store stuff to "QC/leak_correction" - save_dir = os.path.join(output_path, '1-qc') - plot_dir = os.path.join(save_dir, 'leak_correction') - os.makedirs(plot_dir, exist_ok=True) - # Assume two sweeps! sweeps = [0, 1] nsweeps = len(sweeps) @@ -931,16 +915,10 @@ def run_qc_for_protocol(readname, savename, time_strs, output_path, corrected_before = np.empty((nsweeps, nsamples)) corrected_after = np.empty((nsweeps, nsamples)) for isweep in range(nsweeps): - # Get corrected traces, and make plot _, before_leak = fit_linear_leak( - raw_before[well][isweep], voltage, times, *ramp_bounds, - save_fname=os.path.join( - plot_dir, f'{savename}-{well}-{isweep}-before.png')) + raw_before[well][isweep], voltage, times, *ramp_bounds) _, after_leak = fit_linear_leak( - raw_after[well][isweep], voltage, times, *ramp_bounds, - save_fname=os.path.join( - plot_dir, f'{savename}-{well}-{isweep}-after.png')) - + raw_after[well][isweep], voltage, times, *ramp_bounds) corrected_before[isweep] = raw_before[well][isweep] - before_leak corrected_after[isweep] = raw_after[well][isweep] - after_leak @@ -955,15 +933,6 @@ def run_qc_for_protocol(readname, savename, time_strs, output_path, if QC.all_passed(): selected_wells.append(well) - # Save subtracted current in csv file - if write_traces: - header = '"current"' - for i in range(nsweeps): - savepath = os.path.join( - save_dir, f'{save_id}-{savename}-{well}-sweep{i}.csv') - subtracted = corrected_before[i] - corrected_after[i] - np.savetxt(savepath, subtracted, delimiter=',', header=header) - # TODO: Depends on QC used column_labels = ['well', 'qc1.rseal', 'qc1.cm', 'qc1.rseries', 'qc2.raw', 'qc2.subtracted', 'qc3.raw', 'qc3.E4031', 'qc3.subtracted', @@ -1154,7 +1123,7 @@ def fit_func(x, args=None): ] # TESTING ONLY - # np.random.seed(1) + np.random.seed(1) #  Repeat optimisation with different starting guesses x0s = [[np.random.uniform(lower_b, upper_b) for lower_b, upper_b in bounds] for i in range(100)] From ed789c8d8b39281453898c0f948daf946130f729 Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Wed, 19 Nov 2025 19:39:56 +0000 Subject: [PATCH 4/9] Working through run-herg-qc --- pcpostprocess/scripts/run_herg_qc.py | 50 ++++++++++++++++++---------- 1 file changed, 33 insertions(+), 17 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index b9e6c50..f90ed04 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ 
b/pcpostprocess/scripts/run_herg_qc.py @@ -28,6 +28,23 @@ from pcpostprocess.subtraction_plots import do_subtraction_plot +def write_csv(data, path, *args, header=None): + """ + Writes a numpy array ``data`` to ``path``. + + Optional extra arguments are used for longer names:: + + write_csv(x, 'path', 'to', 'file.csv') + + writes ``x`` to ``path/to/file.csv``. + """ + if header is not None: + if not header.startswith('"'): + header = f'"{header}"' + path = os.path.join(path, *args) + np.savetxt(path, data, delimiter=',', comments='', header=header or '') + + def starmap(n, func, iterable): """ Like ``multiprocessing.Pool.starmap``, but does not use subprocesses when @@ -628,8 +645,8 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, columns=['time', 'voltage']) voltage_df.to_csv(os.path.join( traces_dir, f"{save_id}-{savename}-voltages.csv")) - #np.savetxt(os.path.join(traces_dir, f"{save_id}-{savename}-times.csv"), - # times) + #write_csv(times, traces_dir, f'{save_id}-{savename}-times.csv') + qc_before = before_trace.get_onboard_QC_values() qc_after = after_trace.get_onboard_QC_values() @@ -646,14 +663,14 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, # Save before and after drug traces as .csv for sweep in range(nsweeps): - save_fname = os.path.join( - traces_dir, f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv') - np.savetxt(save_fname, before_data[well][sweep], delimiter=',', - header='"current"') - save_fname = os.path.join( - traces_dir, f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv') - np.savetxt(save_fname, after_data[well][sweep], delimiter=',', - header='"current"') + write_csv( + before_data[well][sweep], traces_dir, + f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv', + header='current') + write_csv( + after_data[well][sweep], traces_dir, + f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv', + header='current') # plot subtraction fig = plt.figure(figsize=figure_size, layout='constrained') @@ -783,14 +800,12 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['QC4'] = all([x for x, _ in qc4]) - out_fname = os.path.join(traces_dir, - f"{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv") - np.savetxt(out_fname, subtracted_trace.flatten()) + write_csv( + subtracted_trace, traces_dir, + f'{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv') - rows.append(row_dict) - - param, leak = fit_linear_leak(current, voltage, times, - *ramp_bounds) + param, leak = fit_linear_leak( + current, voltage, times, *ramp_bounds) subtracted_trace = current - leak @@ -808,6 +823,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['-120mV decay time constant 2'] = res[0][1] row_dict['-120mV decay time constant 3'] = res[1] row_dict['-120mV peak current'] = res[2] + rows.append(row_dict) before_leak_current_dict[well] = np.vstack(before_leak_currents) after_leak_current_dict[well] = np.vstack(after_leak_currents) From 2e9b85ff1553b43520fdc90be3e66638829d5377 Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Thu, 20 Nov 2025 16:44:01 +0000 Subject: [PATCH 5/9] Working through run-herg-qc --- pcpostprocess/scripts/run_herg_qc.py | 471 ++++++++++++--------------- 1 file changed, 208 insertions(+), 263 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index f90ed04..3f5f302 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -210,7 +210,6 @@ def 
run(data_path, output_path, qc_map, wells=None, # TODO: Replace this by looping over qc_map and write_map? res_dict = {} for dirname in os.listdir(data_path): - print(dirname, os.path.basename(dirname)) dirname = os.path.basename(dirname) match = protocols_regex.match(dirname) @@ -235,7 +234,6 @@ def pront(*args): for arg in args: print(arg) - pront(res_dict) # At this point, despite its name, res_dict is not a dictionary of results, # but a map of QC protocol names onto lists of times (see comment above) @@ -252,7 +250,6 @@ def pront(*args): continue times = sorted(res_dict[protocol]) - pront(times) savename = qc_map[protocol] @@ -283,8 +280,6 @@ def pront(*args): # savenames: short user name, repeat has _2 added on # times_list: [before1, after1], [before2, after2] - pront(readnames, savenames, times_list) - if not readnames: raise ValueError('No compatible protocols specified.') @@ -293,9 +288,6 @@ def pront(*args): args = zip(readnames, savenames, times_list, [data_path] * m, [wells] * m) well_selections, qc_dfs = zip(*starmap(n, run_qc_for_protocol, args)) - pront(well_selections) - pront(qc_dfs) - # # Assuming a single QC protocol. At this point, we have # @@ -311,7 +303,6 @@ def pront(*args): # qc_df = pd.concat(qc_dfs, ignore_index=True) - pront(qc_df) # # At this point, ``qc_df`` is a single dataframe containing the information @@ -336,7 +327,6 @@ def pront(*args): # Now go over _all_ protocols, including the QC protocols (AGAIN!), and # call extract_protocol() on them # - pront(savenames, readnames, times_list) # Export all protocols savenames, readnames, times_list = [], [], [] @@ -365,9 +355,6 @@ def pront(*args): # TODO Else raise error? - - pront(savenames, readnames, times_list) - wells_to_export = wells if write_failed_traces else selection logging.info(f'exporting wells {wells_to_export}') m = len(readnames) @@ -376,24 +363,17 @@ def pront(*args): [output_path] * m, [data_path] * m, [figure_size] * m, [reversal_potential] * m, [save_id] * m) dfs = starmap(n, extract_protocol, args) + if not dfs: + raise Exception('No data exported') pd.set_option('display.max_columns', None) pd.set_option('display.max_rows', None) - pront(len(dfs)) - for df in dfs: - pront(df) + extract_df = pd.concat(dfs, ignore_index=True) + extract_df['selected'] = extract_df['well'].isin(selection) + pront(extract_df) sys.exit(1) - if dfs: - extract_df = pd.concat(dfs, ignore_index=True) - extract_df['selected'] = extract_df['well'].isin(selection) - else: - logging.error("Didn't export any data") - return - - logging.info(f"extract_df: {extract_df}") - # # Do QC3 on first staircase, first sweep VS second staircase, second sweep # qc3.bookend check very first and very last staircases are similar @@ -605,18 +585,62 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, data_path, figure_size, reversal_potential, save_id): # TODO: Tidy up argument order """ - ??? - """ - print(f'EXTRACT PROTOCOL {savename}') - #logging.info(f"Exporting {readname} as {savename}") - - traces_dir = os.path.join(savedir, 'traces') - os.makedirs(traces_dir, exist_ok=True) - subtraction_plots_dir = os.path.join(savedir, 'subtraction_plots') - os.makedirs(subtraction_plots_dir, exist_ok=True) + Extracts a (non-QC) protocol, performs more QC(!), and + + Loads the data (again!) 
and + - leak-subtracts it using the leak step + - stores before, after, and subtracted traces in ``traces`` (for all + sweeps), along with time and voltage traces + + + - creates and returns a data frame, with entries detailed below + + The dataframe has entries + - ``well`` + - ``sweep`` + - ``protocol`` (savename) + - ``Rseal`` The "before" value + - ``Cm `` The "before" value + - ``Rseries`` The "before" value + - ``gleak_before`` Leak current parameters + - ``E_leak_before`` + - ``gleak_after `` + - ``E_leak_after`` + - ``R_leftover `` + - ``QC.R_leftover `` + - ``E_rev `` + - ``E_rev_before `` + - ``E_rev_after `` + - ``QC.Erev `` + - ``QC6 `` + - ``QC1 `` + - ``QC4 `` + - ``total before-drug flux`` + - ``-120mV decay time constant 1`` + - ``-120mV decay time constant 2`` + - ``-120mV decay time constant 3`` + - ``-120mV peak current`` - row_dict = {} + Also: + - Creates leak subtraction plots, in ``savedir/subtraction_plots`` + - Creates reversal ramp plots, in ``savedir/reversal_plots/`` + - Creates leak correction plots, before and after drug, in + ``savedir/leak_correction/saveid-savename-leak_fit-before/after``. + """ + logging.info(f"Exporting {readname} as {savename}") + + trace_dir = os.path.join(savedir, 'traces') + subtraction_dir = os.path.join(savedir, 'subtraction_plots') + reversal_dir = os.path.join(savedir, 'reversal_plots') + leak_dir1 = os.path.join(savedir, 'leak_correction', + f'{save_id}-{savename}-leak_fit-before') + leak_dir2 = os.path.join(savedir, 'leak_correction', + f'{save_id}-{savename}-leak_fit-after') + for d in (subtraction_dir, reversal_dir, trace_dir, leak_dir1, leak_dir2): + os.makedirs(d, exist_ok=True) + + # Load data before_trace = Trace( os.path.join(data_path, f'{readname}_{time_strs[0]}'), f'{readname}_{time_strs[0]}.json') @@ -624,6 +648,10 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, os.path.join(data_path, f'{readname}_{time_strs[1]}'), f'{readname}_{time_strs[1]}') + # Count sweeps + nsweeps = before_trace.NofSweeps + assert nsweeps == after_trace.NofSweeps + # Get times and voltages times = before_trace.get_times() voltages = before_trace.get_voltage() @@ -635,18 +663,16 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, voltage_protocol = before_trace.get_voltage_protocol() desc = voltage_protocol.get_all_sections() ramp_bounds = detect_ramp_bounds(times, desc) - - nsweeps = before_trace.NofSweeps - assert nsweeps == after_trace.NofSweeps + voltage_steps = [tstart for tstart, tend, vstart, vend in + voltage_protocol.get_all_sections() if vend == vstart] # Store voltages and times, using 2 different libraries... 
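
For reference, the new `write_csv` helper introduced at the top of this file joins its path arguments, wraps the header in quotes, and delegates to `numpy.savetxt`. A minimal usage sketch (the import path follows this repository's layout; the directory and file names are placeholders and the target directory is assumed to exist already):

```python
import numpy as np

# Assumes the module is importable as laid out in this repository
from pcpostprocess.scripts.run_herg_qc import write_csv

current = np.array([0.0, 1.5, 3.2])  # made-up current samples

# Writes 'output/traces/EXPT-staircase-A01-before-sweep0.csv' with a quoted
# '"current"' header line, comma-delimited, and no '#' comment prefix
write_csv(current, 'output/traces',
          'EXPT-staircase-A01-before-sweep0.csv', header='current')
```
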
- voltage_df = pd.DataFrame( - np.vstack((times.flatten(), voltages.flatten())).T, - columns=['time', 'voltage']) + # TODO: REPLACE WITH SINGLE TIMES, SINGLE VOLTAGE + voltage_df = pd.DataFrame(np.vstack((times, voltages)).T, + columns=['time', 'voltage']) voltage_df.to_csv(os.path.join( - traces_dir, f"{save_id}-{savename}-voltages.csv")) - #write_csv(times, traces_dir, f'{save_id}-{savename}-times.csv') - + trace_dir, f'{save_id}-{savename}-voltages.csv')) + #write_csv(times, trace_dir, f'{save_id}-{savename}-times.csv') qc_before = before_trace.get_onboard_QC_values() qc_after = after_trace.get_onboard_QC_values() @@ -654,170 +680,130 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, before_data = before_trace.get_trace_sweeps() after_data = after_trace.get_trace_sweeps() - for i_well, well in enumerate(selected_wells): - if i_well % 24 == 0: - logging.info('row ' + well[0]) - - if None in qc_before[well] or None in qc_after[well]: - continue - - # Save before and after drug traces as .csv + # Save before and after drug traces as .csv + for well in selected_wells: for sweep in range(nsweeps): write_csv( - before_data[well][sweep], traces_dir, + before_data[well][sweep], trace_dir, f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv', header='current') write_csv( - after_data[well][sweep], traces_dir, + after_data[well][sweep], trace_dir, f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv', header='current') - # plot subtraction + # Reusable figure to plot subtraction fig = plt.figure(figsize=figure_size, layout='constrained') - plot_dir = os.path.join(savedir, 'reversal_plots') - os.makedirs(plot_dir, exist_ok=True) + # Reusable hergqc object + hergqc = hERGQC(voltages, before_trace.sampling_rate) + # Process wells, create rows for data frame rows = [] - - before_leak_current_dict = {} - after_leak_current_dict = {} - - out1 = os.path.join(savedir, 'leak_correction', - f'{save_id}-{savename}-leak_fit-before') - out2 = os.path.join(savedir, 'leak_correction', - f'{save_id}-{savename}-leak_fit-after') - os.makedirs(out1, exist_ok=True) - os.makedirs(out2, exist_ok=True) - - - hergqc = hERGQC(voltages, before_trace.sampling_rate) for well in selected_wells: - - before_leak_currents = [] - after_leak_currents = [] - for sweep in range(nsweeps): - row_dict = { - 'well': well, - 'sweep': sweep, - 'protocol': savename - } + row_dict = {'well': well, 'sweep': sweep, 'protocol': savename} qc_vals = qc_before[well][sweep] - if qc_vals is None: - continue - if len(qc_vals) == 0: - continue - row_dict['Rseal'] = qc_vals[0] row_dict['Cm'] = qc_vals[1] row_dict['Rseries'] = qc_vals[2] + # Fit leak using step (again!) 
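
Before the leak fit below, it may help to spell out the arithmetic the surrounding code applies to its results: `fit_linear_leak` returns linear-fit parameters and a leak current trace, the parameters are converted to a leak conductance and reversal potential, and the corrected trace is the raw sweep minus the leak. A self-contained sketch with synthetic numbers (the least-squares fit here is only a stand-in for the package's `fit_linear_leak`, which also uses the detected ramp bounds and can save a figure):

```python
import numpy as np

# Synthetic leak-ramp data: a purely linear leak plus noise (illustrative only)
rng = np.random.default_rng(0)
voltage = np.linspace(-120.0, -80.0, 200)        # mV
g_true, E_true = 2.0, -10.0                      # made-up leak parameters
current = g_true * (voltage - E_true) + rng.normal(0.0, 5.0, voltage.shape)

# Ordinary least-squares fit of I = b0 + b1 * V over the ramp
A = np.vstack((np.ones_like(voltage), voltage)).T
b0, b1 = np.linalg.lstsq(A, current, rcond=None)[0]

g_leak = b1                  # 'gleak_before' in the data frame
E_leak = -b0 / b1            # 'E_leak_before' in the data frame
leak = b0 + b1 * voltage
corrected = current - leak   # as in 'before_corrected' / 'after_corrected'

# R_leftover then compares after-drug and before-drug corrected traces:
#   np.sqrt(np.sum(after_corrected**2) / np.sum(before_corrected**2))
print(g_leak, E_leak)
```
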
before_params, before_leak = fit_linear_leak( before_data[well][sweep], voltages, times, *ramp_bounds, - save_fname=os.path.join(out1, f'{well}_sweep{sweep}.png')) - before_leak_currents.append(before_leak) - - # Convert linear regression parameters into conductance and reversal - row_dict['gleak_before'] = before_params[1] - row_dict['E_leak_before'] = -before_params[0] / before_params[1] - + save_fname=os.path.join(leak_dir1, f'{well}_sweep{sweep}.png')) after_params, after_leak = fit_linear_leak( after_data[well][sweep], voltages, times, *ramp_bounds, - save_fname=os.path.join(out2, f'{well}_sweep{sweep}.png')) + save_fname=os.path.join(leak_dir2, f'{well}_sweep{sweep}.png')) + before_corrected = before_data[well][sweep] - before_leak + after_corrected = after_data[well][sweep] - after_leak - after_leak_currents.append(after_leak) + # Store drug-subtracted trace + subtracted = before_corrected - after_corrected + write_csv( + subtracted, trace_dir, + f'{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv') - # Convert linear regression parameters into conductance and reversal + # Create subtraction plot + fig = plt.figure(figsize=figure_size, layout='constrained') + do_subtraction_plot( + fig, times, range(nsweeps), before_data[well], after_data[well], + voltages, ramp_bounds, well=well, protocol=savename) + fig.savefig(os.path.join( + subtraction_dir, + f'{save_id}-{savename}-{well}-sweep{sweep}-subtraction')) + plt.close(fig) + + # Store conductance and reversal + row_dict['gleak_before'] = before_params[1] + row_dict['E_leak_before'] = -before_params[0] / before_params[1] row_dict['gleak_after'] = after_params[1] row_dict['E_leak_after'] = -after_params[0] / after_params[1] - subtracted_trace = before_data[well][sweep] - before_leak\ - - (after_data[well][sweep] - after_leak) - after_corrected = after_data[well][sweep] - after_leak - before_corrected = before_data[well][sweep] - before_leak + # Add "R_leftover" QC + QC_R_leftover = np.sqrt(np.sum(after_corrected**2) / + np.sum(before_corrected**2)) + row_dict['R_leftover'] = QC_R_leftover + row_dict['QC.R_leftover'] = np.all(row_dict['R_leftover'] < 0.5) + # Infer reversal potential and store, creating plots in the process E_rev_before = infer_reversal_potential( before_corrected, times, desc, voltages, output_path=os.path.join( - plot_dir, f'{well}_{savename}_sweep{sweep}_before'), + reversal_dir, f'{well}_{savename}_sweep{sweep}_before.png'), known_Erev=reversal_potential) - E_rev_after = infer_reversal_potential( after_corrected, times, desc, voltages, output_path=os.path.join( - plot_dir, f'{well}_{savename}_sweep{sweep}_after'), + reversal_dir, f'{well}_{savename}_sweep{sweep}_after.png'), known_Erev=reversal_potential) - E_rev = infer_reversal_potential( - subtracted_trace, times, desc, voltages, + subtracted, times, desc, voltages, output_path=os.path.join( - plot_dir, f'{well}_{savename}_sweep{sweep}_subtracted'), + reversal_dir, f'{well}_{savename}_sweep{sweep}_subtracted.png'), known_Erev=reversal_potential) - row_dict['R_leftover'] =\ - np.sqrt(np.sum((after_corrected)**2)/(np.sum(before_corrected**2))) - - QC_R_leftover = np.all(row_dict['R_leftover'] < 0.5) - row_dict['QC.R_leftover'] = QC_R_leftover - + # Store Erevs parameters row_dict['E_rev'] = E_rev row_dict['E_rev_before'] = E_rev_before row_dict['E_rev_after'] = E_rev_after + # Add Erev QC row_dict['QC.Erev'] = E_rev < -50 and E_rev > -120 # Check QC6 for each protocol (not just the staircase) + # For this, use a capacitance-filtered trace + filtered = 
hergqc.filter_capacitive_spikes( + subtracted, times, voltage_steps) + row_dict['QC6'] = hergqc.qc6(filtered, win=hergqc.qc6_win)[0] - - - times = before_trace.get_times() - voltage = before_trace.get_voltage() - voltage_protocol = before_trace.get_voltage_protocol() - - voltage_steps = [tstart - for tstart, tend, vstart, vend in - voltage_protocol.get_all_sections() if vend == vstart] - - current = hergqc.filter_capacitive_spikes(before_corrected - after_corrected, - times, voltage_steps) - - row_dict['QC6'] = hergqc.qc6(current, win=hergqc.qc6_win)[0] - - #  Assume there is only one sweep for all non-QC protocols + # Check QC1 and QC4, but between before and after traces (assume + # there is only one sweep for non-QC protocols) rseal_before, cm_before, rseries_before = qc_before[well][0] rseal_after, cm_after, rseries_after = qc_after[well][0] - qc1_1 = hergqc.qc1(rseal_before, cm_before, rseries_before) qc1_2 = hergqc.qc1(rseal_after, cm_after, rseries_after) - row_dict['QC1'] = all([x for x, _ in qc1_1 + qc1_2]) - qc4 = hergqc.qc4([rseal_before, rseal_after], [cm_before, cm_after], [rseries_before, rseries_after]) - row_dict['QC4'] = all([x for x, _ in qc4]) - write_csv( - subtracted_trace, traces_dir, - f'{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv') - - param, leak = fit_linear_leak( - current, voltage, times, *ramp_bounds) - - subtracted_trace = current - leak - - t_step = times[1] - times[0] - row_dict['total before-drug flux'] = np.sum(current) * (1.0 / t_step) + # Add a "total flux" value to the data frame, based on the drug + # subtracted and capacitance filtered current + dt = times[1] - times[0] + row_dict['total before-drug flux'] = np.sum(filtered) / dt + # TODO: This measure can be dropped + + # Estimate time constants, based on a trace with yet another round + # of leak-step correction + _, leak = fit_linear_leak(filtered, voltages, times, *ramp_bounds) + fname = os.path.join( + savedir, '-120mV time constant', + f'{savename}-{well}-sweep{sweep}-time-constant-fit.png') res = get_time_constant_of_first_decay( - subtracted_trace, times, desc, - os.path.join( - savedir, 'debug', '-120mV time constant', - f'{savename}-{well}-sweep{sweep}-time-constant-fit.png'), - figure_size - ) + filtered - leak, times, desc, fname, figure_size) row_dict['-120mV decay time constant 1'] = res[0][0] row_dict['-120mV decay time constant 2'] = res[0][1] @@ -825,39 +811,9 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, row_dict['-120mV peak current'] = res[2] rows.append(row_dict) - before_leak_current_dict[well] = np.vstack(before_leak_currents) - after_leak_current_dict[well] = np.vstack(after_leak_currents) - extract_df = pd.DataFrame.from_dict(rows) logging.debug(extract_df) - times = before_trace.get_times() - voltages = before_trace.get_voltage() - - for well in selected_wells: - before_current = before_data[well] - after_current = after_data[well] - - before_leak_currents = before_leak_current_dict[well] - after_leak_currents = after_leak_current_dict[well] - - sub_df = extract_df[extract_df.well == well] - - if len(sub_df.index) == 0: - continue - - sweeps = sorted(list(sub_df.sweep.unique())) - - do_subtraction_plot(fig, times, sweeps, before_current, after_current, - voltages, ramp_bounds, well=well, - protocol=savename) - - fig.savefig(os.path.join(subtraction_plots_dir, - f'{save_id}-{savename}-{well}-sweep{sweep}-subtraction')) - fig.clf() - - plt.close(fig) - return extract_df @@ -886,7 +842,6 @@ def run_qc_for_protocol(readname, savename, 
time_strs, data_path, wells): pandas dataframe containing the pass/fail status of all individual QC criteria, for all wells. """ - print(f'RUN HERG QC FOR {readname}, {time_strs}, {savename}') # TODO Tidy up argument order assert len(time_strs) == 2 @@ -998,7 +953,6 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, Returns a dictionary mapping well names to booleans. """ - print('RUN QC3 bookend') assert len(time_strs) == 4 filepath_first_before = os.path.join(data_path, f'{readname}_{time_strs[0]}') @@ -1091,6 +1045,8 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, def get_time_constant_of_first_decay( trace, times, protocol_desc, output_path, figure_size): """ + Fits a XXX exponential to the -120mV step following the 40mV step at the + start of most protocols. ??? """ os.makedirs(os.path.dirname(output_path), exist_ok=True) @@ -1111,45 +1067,40 @@ def get_time_constant_of_first_decay( indices = np.argwhere((times >= peak_time) & (times <= tend - 50)) - def fit_func(x, args=None): - # Pass 'args=single' when we want to use a single exponential. - # Otherwise use 2 exponentials - if args: - single = args == 'single' - else: - single = False - - if not single: - a, b, c, d = x - if d < b: - b, d = d, b - prediction = c * np.exp((-1.0/d) * (times[indices] - peak_time)) + \ - a * np.exp((-1.0/b) * (times[indices] - peak_time)) - else: - a, b = x - prediction = a * np.exp((-1.0/b) * (times[indices] - peak_time)) + def fit1(x): + """ Error function to fit a one exponential. """ + a, b = x + # prediction = a * np.exp((peak_time - times[indices]) / b) + # No major difference, but keeps it compatible with original code + prediction = a * np.exp((-1.0/b) * (times[indices] - peak_time)) + return np.sum((prediction - trace[indices])**2) + def fit2(x): + """ Error function to fit two exponentials. 
""" + a, b, c, d = x + if d < b: + b, d = d, b + # prediction = (c * np.exp((peak_time - times[indices]) / d) + + # a * np.exp((peak_time - times[indices]) / b)) + prediction = (c * np.exp((-1.0/d) * (times[indices] - peak_time)) + + a * np.exp((-1.0/b) * (times[indices] - peak_time))) return np.sum((prediction - trace[indices])**2) - bounds = [ - (-np.abs(trace).max()*2, 0), - (1e-12, 5e3), - (-np.abs(trace).max()*2, 0), - (1e-12, 5e3), - ] + a_bounds = (-np.abs(trace).max()*2, 0) + b_bounds = (1e-12, 5e3) # TESTING ONLY np.random.seed(1) #  Repeat optimisation with different starting guesses - x0s = [[np.random.uniform(lower_b, upper_b) for lower_b, upper_b in bounds] for i in range(100)] - + bounds = (a_bounds, b_bounds, a_bounds, b_bounds) + x0s = [[np.random.uniform(*b) for b in bounds] for i in range(100)] + # Not required, but keeps it compatible with original code x0s = [[a, b, c, d] if d < b else [a, d, c, b] for (a, b, c, d) in x0s] best_res = None for x0 in x0s: - res = scipy.optimize.minimize(fit_func, x0=x0, - bounds=bounds) + res = scipy.optimize.minimize(fit2, x0=x0, bounds=bounds) if best_res is None: best_res = res elif res.fun < best_res.fun and res.success and res.fun != 0: @@ -1157,18 +1108,12 @@ def fit_func(x, args=None): res1 = best_res #  Re-run with single exponential - bounds = [ - (-np.abs(trace).max()*2, 0), - (1e-12, 5e3), - ] - - #  Repeat optimisation with different starting guesses - x0s = [[np.random.uniform(lower_b, upper_b) for lower_b, upper_b in bounds] for i in range(100)] + bounds = (a_bounds, b_bounds) + x0s = [[np.random.uniform(*b) for b in bounds] for i in range(100)] best_res = None for x0 in x0s: - res = scipy.optimize.minimize(fit_func, x0=x0, - bounds=bounds, args=('single',)) + res = scipy.optimize.minimize(fit1, x0=x0, bounds=bounds) if best_res is None: best_res = res elif res.fun < best_res.fun and res.success and res.fun != 0: @@ -1176,75 +1121,75 @@ def fit_func(x, args=None): res2 = best_res if not res2: - logging.warning('finding 120mv decay timeconstant failed:' + str(res)) + logging.warning(f'finding 120mv decay timeconstant failed: {res}') - if output_path and res: - fig = plt.figure(figsize=figure_size, constrained_layout=True) - axs = fig.subplots(2) + # Create a plot + fig = plt.figure(figsize=figure_size, constrained_layout=True) + axs = fig.subplots(2) - for ax in axs: - ax.spines[['top', 'right']].set_visible(False) - ax.set_ylabel(r'$I_\mathrm{obs}$ (pA)') + for ax in axs: + ax.spines[['top', 'right']].set_visible(False) + ax.set_ylabel(r'$I_\mathrm{obs}$ (pA)') - axs[-1].set_xlabel(r'$t$ (ms)') + axs[-1].set_xlabel(r'$t$ (ms)') - protocol_ax, fit_ax = axs - protocol_ax.set_title('a', fontweight='bold', loc='left') - fit_ax.set_title('b', fontweight='bold', loc='left') - fit_ax.plot(peak_time, peak_current, marker='x', color='red') + protocol_ax, fit_ax = axs + protocol_ax.set_title('a', fontweight='bold', loc='left') + fit_ax.set_title('b', fontweight='bold', loc='left') + fit_ax.plot(peak_time, peak_current, marker='x', color='red') - a, b, c, d = res1.x + a, b, c, d = res1.x - if d < b: - b, d = d, b + if d < b: + b, d = d, b - e, f = res2.x + e, f = res2.x - fit_ax.plot(times[indices], trace[indices], color='grey', - alpha=.5) - fit_ax.plot(times[indices], c * np.exp((-1.0/d) * (times[indices] - peak_time)) - + a * np.exp(-(1.0/b) * (times[indices] - peak_time)), - color='red', linestyle='--') + fit_ax.plot(times[indices], trace[indices], color='grey', + alpha=.5) + fit_ax.plot(times[indices], c * np.exp((-1.0/d) * 
(times[indices] - peak_time)) + + a * np.exp(-(1.0/b) * (times[indices] - peak_time)), + color='red', linestyle='--') - res_string = r'$\tau_{1} = ' f"{d:.1f}" r'\mathrm{ms}'\ - r'\; \tau_{2} = ' f"{b:.1f}" r'\mathrm{ms}$' + res_string = r'$\tau_{1} = ' f"{d:.1f}" r'\mathrm{ms}'\ + r'\; \tau_{2} = ' f"{b:.1f}" r'\mathrm{ms}$' - fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') + fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') - protocol_ax.plot(times, trace) - protocol_ax.axvspan(peak_time, tend - 50, alpha=.5, color='grey') + protocol_ax.plot(times, trace) + protocol_ax.axvspan(peak_time, tend - 50, alpha=.5, color='grey') - fig.savefig(output_path) - fit_ax.set_yscale('symlog') + fig.savefig(output_path) + fit_ax.set_yscale('symlog') - dirname, filename = os.path.split(output_path) - filename = 'log10_' + filename - fig.savefig(os.path.join(dirname, filename)) + dirname, filename = os.path.split(output_path) + filename = 'log10_' + filename + fig.savefig(os.path.join(dirname, filename)) - fit_ax.cla() + fit_ax.cla() - dirname, filename = os.path.split(output_path) - filename = 'single_exp_' + filename - output_path = os.path.join(dirname, filename) + dirname, filename = os.path.split(output_path) + filename = 'single_exp_' + filename + output_path = os.path.join(dirname, filename) - fit_ax.plot(times[indices], trace[indices], color='grey', - alpha=.5) - fit_ax.plot(times[indices], e * np.exp((-1.0/f) * (times[indices] - peak_time)), - color='red', linestyle='--') + fit_ax.plot(times[indices], trace[indices], color='grey', + alpha=.5) + fit_ax.plot(times[indices], e * np.exp((-1.0/f) * (times[indices] - peak_time)), + color='red', linestyle='--') - res_string = r'$\tau = ' f"{f:.1f}" r'\mathrm{ms}$' + res_string = r'$\tau = ' f"{f:.1f}" r'\mathrm{ms}$' - fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') - fig.savefig(output_path) + fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') + fig.savefig(output_path) - dirname, filename = os.path.split(output_path) - filename = 'log10_' + filename - fit_ax.set_yscale('symlog') - fig.savefig(os.path.join(dirname, filename)) + dirname, filename = os.path.split(output_path) + filename = 'log10_' + filename + fit_ax.set_yscale('symlog') + fig.savefig(os.path.join(dirname, filename)) - plt.close(fig) + plt.close(fig) - return (d, b), f, peak_current if res else (np.nan, np.nan), np.nan, peak_current + return (d, b), f, peak_current if res1 else (np.nan, np.nan), np.nan, peak_current if __name__ == '__main__': # pragma: no cover From 546ff4866e3a2d7afbd2fa4cfa068924e235a46b Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Wed, 7 Jan 2026 16:09:35 +0000 Subject: [PATCH 6/9] Tidying run_herg_qc script --- pcpostprocess/scripts/run_herg_qc.py | 328 +++++++++++++++------------ 1 file changed, 187 insertions(+), 141 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index 3f5f302..eaadd8b 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -135,9 +135,14 @@ def run(data_path, output_path, qc_map, wells=None, This proceeds with the following steps: - 1. All wells (or those selected with ``wells``) are run through "plain" QC, - for every protocol in ``qc_map``. - + 1. All wells (or those selected with ``wells``) are run through staircase + QC, for every protocol in ``qc_map``. 
These protocols are expected to be + run either 2 times (before all other protocols, with and without IKr + blocker) or 4 times (before _and_ after all other protocols). This test + is performed by :meth:`run_staircase_qc`. + 2. The list of wells that pass staircase QC are then stored in a file + ``selected-{save_id}-txt``. + 3. For all protocols in either ``qc_map`` or @@ -145,19 +150,22 @@ def run(data_path, output_path, qc_map, wells=None, @param data_path The path to read data from @param output_path The path to write output to - @param qc_map A dictionary mapping input protocol names to output names. - All protocols in this dictionary will be used for quality control. + @param qc_map A dictionary mapping staircase protocol names to output + names. All protocols in this dictionary will be used for staircase + quality control and export. @param wells A list of strings indicating the wells to use, or ``None`` for all wells. @param write_traces Set to ``True`` to write (raw and processed) traces to the output directory in CSV format. @param write_failed_traces Set to ``True`` to write traces for wells failing quality control. Ignored if ``write_traces=False`. - @param write_map A dictionary like ``qc_map``, but specifying protocols to - write traces for without using them in quality control. Ignored if + @param write_map A dictionary like ``qc_map``, but specifying non-staircase + protocols to run generic QC on and export. Ignored if ``write_traces=False`. @param reversal_potential The calculated reversal potential, in mV. - @param reversal_spread_threshold The maximum reversal + @param reversal_spread_threshold The maximum range of reversal potential + values in all tested protocols (``qc_map`` and ``write_map`` if + ``write_traces=True``). @param max_processes The maximum number of processes to run simultaneously @param figure_size An optional tuple specifying the size of figures to create @@ -229,29 +237,35 @@ def run(data_path, output_path, qc_map, wells=None, res_dict[protocol_name].append(time) - def pront(*args): - print('********') + #TEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMP + + def pront(*args, t=None): + if t is None: + print('********') + else: + print(f'******** {t} ********') for arg in args: print(arg) + pd.set_option('display.max_columns', None) + pd.set_option('display.max_rows', None) + + #TEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMP + # At this point, despite its name, res_dict is not a dictionary of results, # but a map of QC protocol names onto lists of times (see comment above) # - # Prepare arguments to call `run_qc_for_protocol` + # Prepare arguments to call `run_staircase_qc` # - combined_dict = qc_map | write_map - # Select QC protocols and times readnames, savenames, times_list = [], [], [] for protocol in res_dict: if protocol not in qc_map: continue - - times = sorted(res_dict[protocol]) - savename = qc_map[protocol] + times = sorted(res_dict[protocol]) if len(times) == 2: savenames.append(savename) @@ -264,14 +278,14 @@ def pront(*args): times_list.append([times[0], times[2]]) # Make seperate savename for protocol repeat - savename = combined_dict[protocol] + '_2' + savename = qc_map[protocol] + '_2' assert savename not in write_map.values() savenames.append(savename) times_list.append([times[1], times[3]]) readnames.append(protocol) else: - raise 
ValueError('Expecting 2 or 4 repeats of the QC protocol') + raise Exception('Expecting QC protocol run 2 or 4 times') # For two repeats, we now have # savenames: short user name, one per QC protocol @@ -286,7 +300,7 @@ def pront(*args): m = len(readnames) n = min(max_processes, m) args = zip(readnames, savenames, times_list, [data_path] * m, [wells] * m) - well_selections, qc_dfs = zip(*starmap(n, run_qc_for_protocol, args)) + well_selections, qc_dfs = zip(*starmap(n, run_staircase_qc, args)) # # Assuming a single QC protocol. At this point, we have @@ -310,11 +324,11 @@ def pront(*args): # where for the second run it has _2 appended # - # Combine QC protocls into overall_selection - selection = [set(x) for x in well_selections] - selection = selection[0].intersection(*selection[1:]) + # Get set of wells passing both s1 and s2 QC + selection = set(well_selections[0]).intersection(set(well_selections[1])) - # Store "plain QC" selections in "selected" files + # Store wells selected by "plain QC" in files starting "selected-" + # TODO: name something more specific, e.g. passed-staircase-{save_id}.txt fname = os.path.join(output_path, f'selected-{save_id}.txt') with open(fname, 'w') as f: f.write('\n'.join(selection)) @@ -324,19 +338,19 @@ def pront(*args): f.write('\n'.join(partial)) # - # Now go over _all_ protocols, including the QC protocols (AGAIN!), and - # call extract_protocol() on them + # Now go over _all_ protocols, including the QC protocols (again!), and + # call run_generic_qc() on them # + combined_dict = qc_map | write_map # Export all protocols savenames, readnames, times_list = [], [], [] for protocol in res_dict: # Sort into chronological order - times = sorted(res_dict[protocol]) savename = combined_dict[protocol] - readnames.append(protocol) + times = sorted(res_dict[protocol]) if len(times) == 2: savenames.append(savename) @@ -350,10 +364,11 @@ def pront(*args): savename = combined_dict[protocol] + '_2' assert savename not in combined_dict.values() savenames.append(savename) - times_list.append(times[1::2]) readnames.append(protocol) + times_list.append(times[1::2]) - # TODO Else raise error? + else: + raise Exception('Expecting QC protocol run 2 or 4 times') wells_to_export = wells if write_failed_traces else selection logging.info(f'exporting wells {wells_to_export}') @@ -362,64 +377,89 @@ def pront(*args): args = zip(readnames, savenames, times_list, [wells_to_export] * m, [output_path] * m, [data_path] * m, [figure_size] * m, [reversal_potential] * m, [save_id] * m) - dfs = starmap(n, extract_protocol, args) + dfs = starmap(n, run_generic_qc, args) if not dfs: raise Exception('No data exported') - pd.set_option('display.max_columns', None) - pd.set_option('display.max_rows', None) + # + # At this point, dfs is a list containing one dataframe _per protocol_. If + # write_failed_traces is set, each dataframe contains information on all + # wells, if not, each contains information only on wells passing + # staircase QC. + # extract_df = pd.concat(dfs, ignore_index=True) extract_df['selected'] = extract_df['well'].isin(selection) - pront(extract_df) - sys.exit(1) # - # Do QC3 on first staircase, first sweep VS second staircase, second sweep - # qc3.bookend check very first and very last staircases are similar + # At this point, extract_df is a big dataframe with information about all + # protocols for all wells (write_failed_traces=True) or just the wells + # passing staircase QC (write_failed_traces=False). 
+ # It inludes new calculated quantities such as ``Erev`` and ``R_leftover``. # + # + # QC3-bookend: Run QC3 on first sweep of first staircase VS second sweep of + # second staircase: this checks if the very first and very last staircases + # are similar (after drug subtraction) + # + # This is run on all wells, regardless of staircase QC. + # protocol, savename = list(qc_map.items())[0] times = sorted(res_dict[protocol]) if len(times) == 4: + # If we have staircase-other-staircase before & after drug qc3_bookend_dict = qc3_bookend( protocol, savename, times, wells, output_path, data_path, figure_size, save_id) - else: - #TODO: Better indicate that it wasn't run? + elif len(times) == 2: + # TODO: Better to indicate that it wasn't run? qc3_bookend_dict = {well: True for well in qc_df.well.unique()} + else: + raise Exception('Expecting QC protocol run 2 or 4 times') + + # Store the results by adding an extra column to qc_df qc_df['qc3.bookend'] = [qc3_bookend_dict[well] for well in qc_df.well] - pront(qc_df) + del(qc3_bookend_dict) # # # - qc_erev_spread = {} erev_spreads = {} + qc_erev_spread = {} passed_qc_dict = {} for well in extract_df.well.unique(): - logging.info(f"Checking QC for well {well}") - # Select only this well + logging.info(f'Checking final QC for well {well}') + + # Get extract_df and qc_df for only this well sub_df = extract_df[extract_df.well == well] sub_qc_df = qc_df[qc_df.well == well] - passed_qc3_bookend = np.all(sub_qc_df['qc3.bookend'].values) - logging.info(f"passed_QC3_bookend_all {passed_qc3_bookend}") + # Check if generic QC passed for all protocols for this well passed_QC_Erev_all = np.all(sub_df['QC.Erev'].values) passed_QC1_all = np.all(sub_df.QC1.values) - logging.info(f"passed_QC1_all {passed_QC1_all}") - passed_QC4_all = np.all(sub_df.QC4.values) - logging.info(f"passed_QC4_all {passed_QC4_all}") passed_QC6_all = np.all(sub_df.QC6.values) - logging.info(f"passed_QC6_all {passed_QC1_all}") + passed_qc3_bookend = np.all(sub_qc_df['qc3.bookend'].values) + passed_QC_Erev_all = np.all(sub_df['QC.Erev'].values) + logging.info(f'passed_QC1_all {passed_QC1_all}') + logging.info(f'passed_QC4_all {passed_QC4_all}') + logging.info(f'passed_QC6_all {passed_QC6_all}') + logging.info(f'passed_QC3_bookend_all {passed_qc3_bookend}') + logging.info(f'passed_QC_Erev_all {passed_QC_Erev_all}') + + # Perform "QC_Erev_spread": this passes if the range of E_rev values in + # all protocols is below a threshold + # QC Erev spread: check spread in reversal potential isn't too large E_revs = sub_df['E_rev'].values.flatten().astype(np.float64) E_rev_spread = E_revs.max() - E_revs.min() - # QC Erev spread: check spread in reversal potential isn't too large passed_QC_Erev_spread = E_rev_spread <= reversal_spread_threshold - logging.info(f"passed_QC_Erev_spread {passed_QC_Erev_spread}") + logging.info(f'passed_QC_Erev_spread {passed_QC_Erev_spread}') + del(E_revs) + erev_spreads[well] = E_rev_spread + qc_erev_spread[well] = passed_QC_Erev_spread # R_leftover only considered for protocols used for QC (i.e. 
staircase protocols) passed_QC_R_leftover = np.all(sub_df[sub_df.protocol.isin(qc_map.values())] @@ -427,29 +467,31 @@ def pront(*args): logging.info(f"passed_QC_R_leftover {passed_QC_R_leftover}") - passed_QC_Erev_spread = E_rev_spread <= reversal_spread_threshold - - qc_erev_spread[well] = passed_QC_Erev_spread - erev_spreads[well] = E_rev_spread - passed_QC_Erev_all = np.all(sub_df['QC.Erev'].values) - logging.info(f"passed_QC_Erev_all {passed_QC_Erev_all}") + passed_staircase_QC = np.all(sub_df['selected'].values) - was_selected = np.all(sub_df['selected'].values) - - passed_qc = passed_qc3_bookend and was_selected\ - and passed_QC_Erev_all and passed_QC6_all\ - and passed_QC_Erev_spread and passed_QC1_all\ + passed_qc = ( + passed_staircase_QC + and passed_QC1_all and passed_QC4_all + and passed_QC6_all + and passed_qc3_bookend + and passed_QC_Erev_all + and passed_QC_Erev_spread + ) passed_qc_dict[well] = passed_qc + pront(extract_df, t='extract_df, pre') extract_df['passed QC'] = [passed_qc_dict[well] for well in extract_df.well] extract_df['QC.Erev.spread'] = [qc_erev_spread[well] for well in extract_df.well] extract_df['Erev_spread'] = [erev_spreads[well] for well in extract_df.well] + pront(extract_df, t='extract_df, post') + sys.exit(1) chrono_dict = {times[0]: prot for prot, times in zip(savenames, times_list)} + pront('CHRONO', os.path.join(output_path, 'chrono.txt')) with open(os.path.join(output_path, 'chrono.txt'), 'w') as fout: for key in sorted(chrono_dict): val = chrono_dict[key] @@ -581,18 +623,18 @@ def agg_func(x): return ret_df -def extract_protocol(readname, savename, time_strs, selected_wells, savedir, - data_path, figure_size, reversal_potential, save_id): - # TODO: Tidy up argument order +def run_generic_qc(readname, savename, time_strs, selected_wells, savedir, + data_path, figure_size, reversal_potential, save_id): """ - Extracts a (non-QC) protocol, performs more QC(!), and + Performs QC on a protocol (staircase or other), and extracts and stores the + data. Loads the data (again!) and - leak-subtracts it using the leak step - stores before, after, and subtracted traces in ``traces`` (for all sweeps), along with time and voltage traces - - + - performs QC (1, 4, 6, Erev, R_leftover) using the meta data, leak step, + and up-down step after the leak step - creates and returns a data frame, with entries detailed below The dataframe has entries @@ -628,7 +670,7 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, ``savedir/leak_correction/saveid-savename-leak_fit-before/after``. """ - logging.info(f"Exporting {readname} as {savename}") + logging.info(f'Exporting {readname} as {savename}') trace_dir = os.path.join(savedir, 'traces') subtraction_dir = os.path.join(savedir, 'subtraction_plots') @@ -702,12 +744,12 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, rows = [] for well in selected_wells: for sweep in range(nsweeps): - row_dict = {'well': well, 'sweep': sweep, 'protocol': savename} + row = {'well': well, 'sweep': sweep, 'protocol': savename} qc_vals = qc_before[well][sweep] - row_dict['Rseal'] = qc_vals[0] - row_dict['Cm'] = qc_vals[1] - row_dict['Rseries'] = qc_vals[2] + row['Rseal'] = qc_vals[0] + row['Cm'] = qc_vals[1] + row['Rseries'] = qc_vals[2] # Fit leak using step (again!) 
before_params, before_leak = fit_linear_leak( @@ -736,16 +778,16 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, plt.close(fig) # Store conductance and reversal - row_dict['gleak_before'] = before_params[1] - row_dict['E_leak_before'] = -before_params[0] / before_params[1] - row_dict['gleak_after'] = after_params[1] - row_dict['E_leak_after'] = -after_params[0] / after_params[1] + row['gleak_before'] = before_params[1] + row['E_leak_before'] = -before_params[0] / before_params[1] + row['gleak_after'] = after_params[1] + row['E_leak_after'] = -after_params[0] / after_params[1] # Add "R_leftover" QC QC_R_leftover = np.sqrt(np.sum(after_corrected**2) / np.sum(before_corrected**2)) - row_dict['R_leftover'] = QC_R_leftover - row_dict['QC.R_leftover'] = np.all(row_dict['R_leftover'] < 0.5) + row['R_leftover'] = QC_R_leftover + row['QC.R_leftover'] = np.all(row['R_leftover'] < 0.5) # Infer reversal potential and store, creating plots in the process E_rev_before = infer_reversal_potential( @@ -765,18 +807,18 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, known_Erev=reversal_potential) # Store Erevs parameters - row_dict['E_rev'] = E_rev - row_dict['E_rev_before'] = E_rev_before - row_dict['E_rev_after'] = E_rev_after + row['E_rev'] = E_rev + row['E_rev_before'] = E_rev_before + row['E_rev_after'] = E_rev_after # Add Erev QC - row_dict['QC.Erev'] = E_rev < -50 and E_rev > -120 + row['QC.Erev'] = E_rev < -50 and E_rev > -120 # Check QC6 for each protocol (not just the staircase) # For this, use a capacitance-filtered trace filtered = hergqc.filter_capacitive_spikes( subtracted, times, voltage_steps) - row_dict['QC6'] = hergqc.qc6(filtered, win=hergqc.qc6_win)[0] + row['QC6'] = hergqc.qc6(filtered, win=hergqc.qc6_win)[0] # Check QC1 and QC4, but between before and after traces (assume # there is only one sweep for non-QC protocols) @@ -784,16 +826,16 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, rseal_after, cm_after, rseries_after = qc_after[well][0] qc1_1 = hergqc.qc1(rseal_before, cm_before, rseries_before) qc1_2 = hergqc.qc1(rseal_after, cm_after, rseries_after) - row_dict['QC1'] = all([x for x, _ in qc1_1 + qc1_2]) + row['QC1'] = all([x for x, _ in qc1_1 + qc1_2]) qc4 = hergqc.qc4([rseal_before, rseal_after], [cm_before, cm_after], [rseries_before, rseries_after]) - row_dict['QC4'] = all([x for x, _ in qc4]) + row['QC4'] = all([x for x, _ in qc4]) # Add a "total flux" value to the data frame, based on the drug # subtracted and capacitance filtered current dt = times[1] - times[0] - row_dict['total before-drug flux'] = np.sum(filtered) / dt + row['total before-drug flux'] = np.sum(filtered) / dt # TODO: This measure can be dropped # Estimate time constants, based on a trace with yet another round @@ -805,11 +847,11 @@ def extract_protocol(readname, savename, time_strs, selected_wells, savedir, res = get_time_constant_of_first_decay( filtered - leak, times, desc, fname, figure_size) - row_dict['-120mV decay time constant 1'] = res[0][0] - row_dict['-120mV decay time constant 2'] = res[0][1] - row_dict['-120mV decay time constant 3'] = res[1] - row_dict['-120mV peak current'] = res[2] - rows.append(row_dict) + row['-120mV decay time constant 1'] = res[0][0] + row['-120mV decay time constant 2'] = res[0][1] + row['-120mV decay time constant 3'] = res[1] + row['-120mV peak current'] = res[2] + rows.append(row) extract_df = pd.DataFrame.from_dict(rows) logging.debug(extract_df) @@ -817,9 +859,9 @@ def 
extract_protocol(readname, savename, time_strs, selected_wells, savedir, return extract_df -def run_qc_for_protocol(readname, savename, time_strs, data_path, wells): +def run_staircase_qc(readname, savename, time_strs, data_path, wells): """ - Runs a QC procedure for a single protocol, on the selected wells. + Runs a QC procedure for a single staircase run, on the selected wells. Assumes: - time_strs has length 2, corresponding to a before- and after-drug trace @@ -830,7 +872,6 @@ def run_qc_for_protocol(readname, savename, time_strs, data_path, wells): - Traces are drug subtracted - No capacitative spike filtering - @param readname The protocol name, without the time part @param savename The shorter name for the protocol @param time_strs The time part of the protocol names, must have length 2 @@ -935,7 +976,7 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, staircase with the second sweep of the second staircase. Also: - - Creates a directory ``output_path/3-qc3-bookend`` + - Creates a directory ``output_path/qc3-bookend`` - Creates plots comparing staircase 1, sweep 1, with staircase 2, sweep 2. TODO: This method repeats lots of steps, unneccesarily: @@ -957,8 +998,8 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, filepath_first_before = os.path.join(data_path, f'{readname}_{time_strs[0]}') filepath_last_before = os.path.join(data_path, f'{readname}_{time_strs[1]}') - json_file_first_before = f"{readname}_{time_strs[0]}" - json_file_last_before = f"{readname}_{time_strs[1]}" + json_file_first_before = f'{readname}_{time_strs[0]}' + json_file_last_before = f'{readname}_{time_strs[1]}' #  Each Trace object contains two sweeps first_before_trace = Trace(filepath_first_before, json_file_first_before) @@ -973,8 +1014,8 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, filepath_first_after = os.path.join(data_path, f'{readname}_{time_strs[2]}') filepath_last_after = os.path.join(data_path, f'{readname}_{time_strs[3]}') - json_file_first_after = f"{readname}_{time_strs[2]}" - json_file_last_after = f"{readname}_{time_strs[3]}" + json_file_first_after = f'{readname}_{time_strs[2]}' + json_file_last_after = f'{readname}_{time_strs[3]}' first_after_trace = Trace(filepath_first_after, json_file_first_after) last_after_trace = Trace(filepath_last_after, json_file_last_after) @@ -1003,7 +1044,7 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, voltage=voltage) # Update output path - output_path = os.path.join(output_path, '3-qc3-bookend') + output_path = os.path.join(output_path, 'qc3-bookend') os.makedirs(output_path, exist_ok=True) # Create figure - will be reused @@ -1045,12 +1086,24 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, def get_time_constant_of_first_decay( trace, times, protocol_desc, output_path, figure_size): """ - Fits a XXX exponential to the -120mV step following the 40mV step at the - start of most protocols. - ??? + Fits a biexponential and single exponential curve to the -120mV step + following the 40mV step at the start of a protocol, returning time + constants estimated both ways. + + @param trace The current + @param times The associated times + @param protocol_desc A list of protocol steps and ramps, as returned by + :meth:`voltage_protocol.get_all_sections()` + @param output_path A filename for a figure. 
Will store 3 others with + similar names + @param figure_size A matplotlib figsize argument + + Returns a tuple ``(bi_taus), tau, peak_current`` where ``bi_taus`` contains + two time constants determined from a biexponential fit, where ``tau`` is a + single time constant from the single exponential fit, and where + ``peak_current`` is the peak (most negative) current. If any time constants + could not be determined, ``np.nan`` is returned. """ - os.makedirs(os.path.dirname(output_path), exist_ok=True) - first_120mV_step_index = [ i for i, line in enumerate(protocol_desc) if line[2] == 40][0] + 1 @@ -1123,73 +1176,66 @@ def fit2(x): if not res2: logging.warning(f'finding 120mv decay timeconstant failed: {res}') + # # Create a plot + # fig = plt.figure(figsize=figure_size, constrained_layout=True) axs = fig.subplots(2) - for ax in axs: ax.spines[['top', 'right']].set_visible(False) ax.set_ylabel(r'$I_\mathrm{obs}$ (pA)') - - axs[-1].set_xlabel(r'$t$ (ms)') - protocol_ax, fit_ax = axs + protocol_ax.set_title('a', fontweight='bold', loc='left') + protocol_ax.plot(times, trace) + protocol_ax.axvspan(peak_time, tend - 50, alpha=.5, color='grey') + fit_ax.set_title('b', fontweight='bold', loc='left') + fit_ax.set_xlabel(r'$t$ (ms)') fit_ax.plot(peak_time, peak_current, marker='x', color='red') a, b, c, d = res1.x - if d < b: b, d = d, b - e, f = res2.x - fit_ax.plot(times[indices], trace[indices], color='grey', alpha=.5) fit_ax.plot(times[indices], c * np.exp((-1.0/d) * (times[indices] - peak_time)) + a * np.exp(-(1.0/b) * (times[indices] - peak_time)), color='red', linestyle='--') - res_string = r'$\tau_{1} = ' f"{d:.1f}" r'\mathrm{ms}'\ - r'\; \tau_{2} = ' f"{b:.1f}" r'\mathrm{ms}$' - - fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') - - protocol_ax.plot(times, trace) - protocol_ax.axvspan(peak_time, tend - 50, alpha=.5, color='grey') + fit_ax.annotate( + rf'$\tau_1$ = {d:.1f} ms; $\tau_2$ = {b:.1f} ms', + xy=(0.5, 0.05), xycoords='axes fraction') + # Store + dirname, filename = os.path.split(output_path) + os.makedirs(dirname, exist_ok=True) fig.savefig(output_path) - fit_ax.set_yscale('symlog') - dirname, filename = os.path.split(output_path) - filename = 'log10_' + filename - fig.savefig(os.path.join(dirname, filename)) + # Store again, now with symmetric log scaling on the y-axis + fit_ax.set_yscale('symlog') + fig.savefig(os.path.join(dirname, 'log10_' + filename)) + # Remove biexponential and store again fit_ax.cla() + fit_ax.plot(times[indices], trace[indices], color='grey', alpha=.5) + fit_ax.plot( + times[indices], e * np.exp((-1.0/f) * (times[indices] - peak_time)), + color='red', linestyle='--') + fit_ax.annotate( + rf'$\tau$ = {f:.1f} ms', xy=(0.5, 0.05), xycoords='axes fraction') - dirname, filename = os.path.split(output_path) - filename = 'single_exp_' + filename - output_path = os.path.join(dirname, filename) - - fit_ax.plot(times[indices], trace[indices], color='grey', - alpha=.5) - fit_ax.plot(times[indices], e * np.exp((-1.0/f) * (times[indices] - peak_time)), - color='red', linestyle='--') + fig.savefig(os.path.join(dirname, 'single_exp_' + filename)) - res_string = r'$\tau = ' f"{f:.1f}" r'\mathrm{ms}$' - - fit_ax.annotate(res_string, xy=(0.5, 0.05), xycoords='axes fraction') - fig.savefig(output_path) - - dirname, filename = os.path.split(output_path) - filename = 'log10_' + filename + # And once again with symmetric log scaling fit_ax.set_yscale('symlog') - fig.savefig(os.path.join(dirname, filename)) - + fig.savefig(os.path.join(dirname, 
'log10_single_exp_' + filename)) plt.close(fig) - return (d, b), f, peak_current if res1 else (np.nan, np.nan), np.nan, peak_current + params1 = (d, b) if res1 else (np.nan, np.nan) + param2 = f if res2 else np.nan + return params1, param2, peak_current if __name__ == '__main__': # pragma: no cover From 0c99933fc58e4bdd732585fb8924c20d7f233fd9 Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Wed, 7 Jan 2026 22:17:19 +0000 Subject: [PATCH 7/9] Tidying run_herg_qc script --- pcpostprocess/scripts/run_herg_qc.py | 270 ++++++++++++++------------- 1 file changed, 142 insertions(+), 128 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index eaadd8b..0584954 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -111,38 +111,52 @@ def run_from_command_line(): # pragma: no cover run( args.data_directory, args.output_dir, - export_config.D2S_QC, + export_config.saveID, + staircase_protocols=export_config.D2S_QC, + additional_protocols=export_config.D2S, wells=args.wells, write_traces=args.output_traces, - write_failed_traces=args.export_failed, - write_map=export_config.D2S, + include_failed_traces=args.export_failed, reversal_potential=args.Erev, reversal_spread_threshold=args.reversal_spread_threshold, max_processes=args.no_cpus, figure_size=args.figsize, - save_id=export_config.saveID, ) -def run(data_path, output_path, qc_map, wells=None, - write_traces=False, write_failed_traces=False, write_map={}, +def run(data_path, output_path, save_id, staircase_protocols, + additional_protocols={}, wells=None, + write_traces=False, include_failed_traces=False, reversal_potential=-90, reversal_spread_threshold=10, - max_processes=1, figure_size=None, save_id=None): + max_processes=1, figure_size=None): """ - Imports traces and runs QC+. - - Makes the following assumptions: + Imports traces and runs several layers of quality control (staircase QC, + followed by secondary QC). This proceeds with the following steps: 1. All wells (or those selected with ``wells``) are run through staircase - QC, for every protocol in ``qc_map``. These protocols are expected to be - run either 2 times (before all other protocols, with and without IKr - blocker) or 4 times (before _and_ after all other protocols). This test - is performed by :meth:`run_staircase_qc`. + QC, for every protocol in ``staircase_protocols``. These protocols are + expected to be run either 2 times (before all other protocols, with and + without IKr blocker) or 4 times (before _and_ after all other + protocols). Timestamps will be used to determine the order of the + sweeps. 2. The list of wells that pass staircase QC are then stored in a file ``selected-{save_id}-txt``. - 3. For all protocols in either ``qc_map`` or + 3. For all protocols in ``staircase_protocols`` and + ``additional_protocols``, a secondary QC layer is run. This is similar + to a stripped-down version of the staircase QC. If ``write_traces=True`` + this step includes storing before, after, and subtracted traces as CSV. + By default, this step is run only for wells that passed staircase QC, + but it can be run for all wells by setting + ``include_failed_traces=True``. + 4. Another staircase QC measure, "QC3 bookend" is run (on all wells, not + just those that passed previous QC). **This should probably be included + in step 1, and affect the selected wells**. + 5. A file "chrono.txt" is created with the order that protocols were run in + 6. 
Multi-protocol QC is performed on all wells that passed staircase QC (or + all wells, if ``include_failed_traces=True``. + 7. @@ -150,27 +164,27 @@ def run(data_path, output_path, qc_map, wells=None, @param data_path The path to read data from @param output_path The path to write output to - @param qc_map A dictionary mapping staircase protocol names to output - names. All protocols in this dictionary will be used for staircase - quality control and export. + @param save_id An "id" string used in e.g. filenames of created CSVs. + @param staircase_protocols The protocols to run staircase QC on, specified + as a dictionary mapping MapDataControl384 names to shorter names to + use in the output. All protocols in this dictionary will be used for + staircase QC, secondary QC and export of traces. + @param additional_protocols The non-staircase protocols to run secondary QC + on and export traces for, specified as a dictionary mapping + MapDataControl384 names to shorter names to use in the output. @param wells A list of strings indicating the wells to use, or ``None`` for all wells. @param write_traces Set to ``True`` to write (raw and processed) traces to the output directory in CSV format. - @param write_failed_traces Set to ``True`` to write traces for wells - failing quality control. Ignored if ``write_traces=False`. - @param write_map A dictionary like ``qc_map``, but specifying non-staircase - protocols to run generic QC on and export. Ignored if - ``write_traces=False`. + @param include_failed_traces Set to ``True`` to perform QC and export even + on wells failing staircase quality control. @param reversal_potential The calculated reversal potential, in mV. @param reversal_spread_threshold The maximum range of reversal potential - values in all tested protocols (``qc_map`` and ``write_map`` if - ``write_traces=True``). + values allowed in secondary QC, in mV. @param max_processes The maximum number of processes to run simultaneously @param figure_size An optional tuple specifying the size of figures to create - @param save_id Used in some outputs, e.g. as part of CSV names """ # TODO reversal_spread_threshold should be specified the same way as all @@ -179,7 +193,7 @@ def run(data_path, output_path, qc_map, wells=None, # Create output path if necessary, and write info file output_path = setup_output_directory(output_path) - # TODO Remove protocol selection here: this is done via the export file! + # TODO Remove protocol selection here: this is done via the export file # Only protocols listed there are accepted # Select wells to use @@ -215,27 +229,21 @@ def run(data_path, output_path, qc_map, wells=None, # { protocol_name: [time1, time2, time3, ...] } # such that protocol_name_time is a directory - # TODO: Replace this by looping over qc_map and write_map? + # TODO: Replace this by looping over all_protocols? 
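
For intuition, the directory scan below turns each recording directory name into a protocol name plus a time stamp and collects the times per protocol. A rough, self-contained sketch; the regular expression and directory names here are illustrative stand-ins, not the script's actual `protocols_regex`:

```python
import re

# Stand-in pattern: '<protocol name>_<HH.MM.SS>' directory names
protocols_regex = re.compile(r'^(.+)_(\d{2}\.\d{2}\.\d{2})$')

dirnames = [
    'exampleStaircase_15.01.07',
    'exampleOther_15.01.51',
    'exampleStaircase_15.06.53',
]

res_dict = {}
for dirname in dirnames:
    match = protocols_regex.match(dirname)
    if match is None:
        continue
    protocol_name, time = match.group(1), match.group(2)
    res_dict.setdefault(protocol_name, []).append(time)

# res_dict == {'exampleStaircase': ['15.01.07', '15.06.53'],
#              'exampleOther': ['15.01.51']}
```
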
+ all_protocols = staircase_protocols | additional_protocols res_dict = {} for dirname in os.listdir(data_path): dirname = os.path.basename(dirname) match = protocols_regex.match(dirname) - if match is None: continue - protocol_name = match.group(1) - - if not (protocol_name in qc_map or protocol_name in write_map): - print(f'Skipping {protocol_name}') + if protocol_name not in all_protocols: + print(f'Skipping unknown protocol {protocol_name}') continue - - time = match.group(2) - if protocol_name not in res_dict: res_dict[protocol_name] = [] - - res_dict[protocol_name].append(time) + res_dict[protocol_name].append(match.group(2)) #TEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMP @@ -262,9 +270,9 @@ def pront(*args, t=None): # Select QC protocols and times readnames, savenames, times_list = [], [], [] for protocol in res_dict: - if protocol not in qc_map: + if protocol not in staircase_protocols: continue - savename = qc_map[protocol] + savename = staircase_protocols[protocol] times = sorted(res_dict[protocol]) if len(times) == 2: @@ -278,8 +286,8 @@ def pront(*args, t=None): times_list.append([times[0], times[2]]) # Make seperate savename for protocol repeat - savename = qc_map[protocol] + '_2' - assert savename not in write_map.values() + savename = staircase_protocols[protocol] + '_2' + assert savename not in additional_protocols.values() savenames.append(savename) times_list.append([times[1], times[3]]) readnames.append(protocol) @@ -339,16 +347,15 @@ def pront(*args, t=None): # # Now go over _all_ protocols, including the QC protocols (again!), and - # call run_generic_qc() on them + # call run_secondary_qc() on them # - combined_dict = qc_map | write_map # Export all protocols savenames, readnames, times_list = [], [], [] for protocol in res_dict: # Sort into chronological order - savename = combined_dict[protocol] + savename = all_protocols[protocol] readnames.append(protocol) times = sorted(res_dict[protocol]) @@ -361,8 +368,8 @@ def pront(*args, t=None): times_list.append(times[::2]) # Make seperate savename for protocol repeat - savename = combined_dict[protocol] + '_2' - assert savename not in combined_dict.values() + savename = all_protocols[protocol] + '_2' + assert savename not in all_protocols.values() savenames.append(savename) readnames.append(protocol) times_list.append(times[1::2]) @@ -370,20 +377,20 @@ def pront(*args, t=None): else: raise Exception('Expecting QC protocol run 2 or 4 times') - wells_to_export = wells if write_failed_traces else selection + wells_to_export = wells if include_failed_traces else selection logging.info(f'exporting wells {wells_to_export}') m = len(readnames) n = min(max_processes, m) args = zip(readnames, savenames, times_list, [wells_to_export] * m, [output_path] * m, [data_path] * m, [figure_size] * m, - [reversal_potential] * m, [save_id] * m) - dfs = starmap(n, run_generic_qc, args) + [reversal_potential] * m, [save_id] * m, [write_traces] * m) + dfs = starmap(n, run_secondary_qc, args) if not dfs: raise Exception('No data exported') # # At this point, dfs is a list containing one dataframe _per protocol_. If - # write_failed_traces is set, each dataframe contains information on all + # include_failed_traces is set, each dataframe contains information on all # wells, if not, each contains information only on wells passing # staircase QC. 
# @@ -393,8 +400,8 @@ def pront(*args, t=None): # # At this point, extract_df is a big dataframe with information about all - # protocols for all wells (write_failed_traces=True) or just the wells - # passing staircase QC (write_failed_traces=False). + # protocols for all wells (include_failed_traces=True) or just the wells + # passing staircase QC (include_failed_traces=False). # It inludes new calculated quantities such as ``Erev`` and ``R_leftover``. # @@ -405,7 +412,11 @@ def pront(*args, t=None): # # This is run on all wells, regardless of staircase QC. # - protocol, savename = list(qc_map.items())[0] + # TODO: This doesn't require the export and is staircase only, so should + # probably happen earlier in the sequence and affect the well selection for + # secondary QC. + # + protocol, savename = list(staircase_protocols.items())[0] times = sorted(res_dict[protocol]) if len(times) == 4: # If we have staircase-other-staircase before & after drug @@ -423,9 +434,21 @@ def pront(*args, t=None): del(qc3_bookend_dict) # + # Write a file chrono.txt containing the order that protocols were run in # - # + chrono = {times[0]: prot for prot, times in zip(savenames, times_list)} + with open(os.path.join(output_path, 'chrono.txt'), 'w') as fout: + fout.write('\n'.join(chrono[key] for key in sorted(chrono)) + '\n') + # + # Perform multi-protocol QC per well. + # - Check that QC1, QC4, QC6 and QC_Erev pass for all protocols. + # - Check that QC3 bookend passes for "all" staircase protocols (typically + # this is just a single check). + # - Check that the range of Erev values is below a threshold + # - Check that all staircase QCs passed + # - Check that QC_R_leftover passed for all staircase protocols + # erev_spreads = {} qc_erev_spread = {} passed_qc_dict = {} @@ -437,40 +460,38 @@ def pront(*args, t=None): sub_qc_df = qc_df[qc_df.well == well] # Check if generic QC passed for all protocols for this well - passed_QC_Erev_all = np.all(sub_df['QC.Erev'].values) passed_QC1_all = np.all(sub_df.QC1.values) passed_QC4_all = np.all(sub_df.QC4.values) passed_QC6_all = np.all(sub_df.QC6.values) - passed_qc3_bookend = np.all(sub_qc_df['qc3.bookend'].values) passed_QC_Erev_all = np.all(sub_df['QC.Erev'].values) + passed_qc3_bookend = np.all(sub_qc_df['qc3.bookend'].values) logging.info(f'passed_QC1_all {passed_QC1_all}') logging.info(f'passed_QC4_all {passed_QC4_all}') logging.info(f'passed_QC6_all {passed_QC6_all}') - logging.info(f'passed_QC3_bookend_all {passed_qc3_bookend}') logging.info(f'passed_QC_Erev_all {passed_QC_Erev_all}') - + logging.info(f'passed_QC3_bookend_all {passed_qc3_bookend}') # Perform "QC_Erev_spread": this passes if the range of E_rev values in # all protocols is below a threshold # QC Erev spread: check spread in reversal potential isn't too large E_revs = sub_df['E_rev'].values.flatten().astype(np.float64) E_rev_spread = E_revs.max() - E_revs.min() + del(E_revs) passed_QC_Erev_spread = E_rev_spread <= reversal_spread_threshold logging.info(f'passed_QC_Erev_spread {passed_QC_Erev_spread}') - del(E_revs) erev_spreads[well] = E_rev_spread qc_erev_spread[well] = passed_QC_Erev_spread - # R_leftover only considered for protocols used for QC (i.e. 
staircase protocols) - passed_QC_R_leftover = np.all(sub_df[sub_df.protocol.isin(qc_map.values())] - ['QC.R_leftover'].values) - - logging.info(f"passed_QC_R_leftover {passed_QC_R_leftover}") - + # R_leftover only considered for protocols used for QC (staircases) + passed_QC_R_leftover = np.all( + sub_df[sub_df.protocol.isin(staircase_protocols.values())]['QC.R_leftover'].values) + logging.info(f'passed_QC_R_leftover {passed_QC_R_leftover}') + # All staircase protocol QCs passed_staircase_QC = np.all(sub_df['selected'].values) - passed_qc = ( + # Combine all into a final verdict + passed_qc_dict[well] = ( passed_staircase_QC and passed_QC1_all and passed_QC4_all @@ -480,53 +501,43 @@ def pront(*args, t=None): and passed_QC_Erev_spread ) - passed_qc_dict[well] = passed_qc - - pront(extract_df, t='extract_df, pre') extract_df['passed QC'] = [passed_qc_dict[well] for well in extract_df.well] extract_df['QC.Erev.spread'] = [qc_erev_spread[well] for well in extract_df.well] extract_df['Erev_spread'] = [erev_spreads[well] for well in extract_df.well] - pront(extract_df, t='extract_df, post') - sys.exit(1) - chrono_dict = {times[0]: prot for prot, times in zip(savenames, times_list)} + # + # At this point, extract_df has 3 new columns. + # Confusingly, it has the old column "selected" indicating whether or not + # staircase QC passed, and a new column "passed QC" indicating final + # selection. + # - pront('CHRONO', os.path.join(output_path, 'chrono.txt')) - with open(os.path.join(output_path, 'chrono.txt'), 'w') as fout: - for key in sorted(chrono_dict): - val = chrono_dict[key] - #  Output order of protocols - fout.write(val) - fout.write('\n') - - #  Update qc_df - update_cols = [] + # + # Copy true/false assessments from extract_df into qc_df + # + new_cols = [] for index, vals in qc_df.iterrows(): - append_dict = {} + sub_df = extract_df[(extract_df.well == vals['well'])] + row = {} + row['QC.Erev.all_protocols'] = np.all(sub_df['QC.Erev']) + row['QC.Erev.spread'] = np.all(sub_df['QC.Erev.spread']) + row['QC1.all_protocols'] = np.all(sub_df['QC1']) + row['QC4.all_protocols'] = np.all(sub_df['QC4']) + row['QC6.all_protocols'] = np.all(sub_df['QC6']) + new_cols.append(row) + for key in new_cols[0]: + qc_df[key] = [row[key] for row in new_cols] - well = vals['well'] - - sub_df = extract_df[(extract_df.well == well)] - - append_dict['QC.Erev.all_protocols'] =\ - np.all(sub_df['QC.Erev']) - - append_dict['QC.Erev.spread'] =\ - np.all(sub_df['QC.Erev.spread']) + # + # At this point, extract_df contains mostly numerical values, and a few + # True/False results, while qc_df contains pass/fails. + # - append_dict['QC1.all_protocols'] =\ - np.all(sub_df['QC1']) + sys.exit(1) - append_dict['QC4.all_protocols'] =\ - np.all(sub_df['QC4']) - append_dict['QC6.all_protocols'] =\ - np.all(sub_df['QC6']) - update_cols.append(append_dict) - for key in append_dict: - qc_df[key] = [row[key] for row in update_cols] qc_styled_df = create_qc_table(qc_df) logging.info(qc_styled_df) @@ -623,16 +634,16 @@ def agg_func(x): return ret_df -def run_generic_qc(readname, savename, time_strs, selected_wells, savedir, - data_path, figure_size, reversal_potential, save_id): +def run_secondary_qc(readname, savename, time_strs, selected_wells, savedir, + data_path, figure_size, reversal_potential, save_id, + write_traces): """ - Performs QC on a protocol (staircase or other), and extracts and stores the - data. + Performs QC on a protocol (staircase or other), and exports the traces. Loads the data (again!) 
and - leak-subtracts it using the leak step - - stores before, after, and subtracted traces in ``traces`` (for all - sweeps), along with time and voltage traces + - if ``write_traces`` is True``, stores before, after, and subtracted + traces in ``traces`` (for all sweeps), along with time and voltage traces - performs QC (1, 4, 6, Erev, R_leftover) using the meta data, leak step, and up-down step after the leak step - creates and returns a data frame, with entries detailed below @@ -710,11 +721,12 @@ def run_generic_qc(readname, savename, time_strs, selected_wells, savedir, # Store voltages and times, using 2 different libraries... # TODO: REPLACE WITH SINGLE TIMES, SINGLE VOLTAGE - voltage_df = pd.DataFrame(np.vstack((times, voltages)).T, - columns=['time', 'voltage']) - voltage_df.to_csv(os.path.join( - trace_dir, f'{save_id}-{savename}-voltages.csv')) - #write_csv(times, trace_dir, f'{save_id}-{savename}-times.csv') + if write_traces: + voltage_df = pd.DataFrame( + np.vstack((times, voltages)).T, columns=['time', 'voltage']) + voltage_df.to_csv(os.path.join( + trace_dir, f'{save_id}-{savename}-voltages.csv')) + #write_csv(times, trace_dir, f'{save_id}-{savename}-times.csv') qc_before = before_trace.get_onboard_QC_values() qc_after = after_trace.get_onboard_QC_values() @@ -723,16 +735,17 @@ def run_generic_qc(readname, savename, time_strs, selected_wells, savedir, after_data = after_trace.get_trace_sweeps() # Save before and after drug traces as .csv - for well in selected_wells: - for sweep in range(nsweeps): - write_csv( - before_data[well][sweep], trace_dir, - f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv', - header='current') - write_csv( - after_data[well][sweep], trace_dir, - f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv', - header='current') + if write_traces: + for well in selected_wells: + for sweep in range(nsweeps): + write_csv( + before_data[well][sweep], trace_dir, + f'{save_id}-{savename}-{well}-before-sweep{sweep}.csv', + header='current') + write_csv( + after_data[well][sweep], trace_dir, + f'{save_id}-{savename}-{well}-after-sweep{sweep}.csv', + header='current') # Reusable figure to plot subtraction fig = plt.figure(figsize=figure_size, layout='constrained') @@ -761,11 +774,12 @@ def run_generic_qc(readname, savename, time_strs, selected_wells, savedir, before_corrected = before_data[well][sweep] - before_leak after_corrected = after_data[well][sweep] - after_leak - # Store drug-subtracted trace + # Create and store drug-subtracted trace subtracted = before_corrected - after_corrected - write_csv( - subtracted, trace_dir, - f'{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv') + if write_traces: + write_csv( + subtracted, trace_dir, + f'{save_id}-{savename}-{well}-sweep{sweep}-subtracted.csv') # Create subtraction plot fig = plt.figure(figsize=figure_size, layout='constrained') @@ -1053,7 +1067,7 @@ def qc3_bookend(readname, savename, time_strs, wells, output_path, #  Iterate over all wells, perform qc3-bookend, plot and store res_dict = {} - for well in np.array(wells).flatten(): + for well in wells: # First staircase, before drug and after drug, first sweep before_trace_first = get_leak_corrected( first_before_current_dict[well][0], voltage, times, *ramp_bounds) From d560541bd9d3ac7f2e2dc7d1dd7e20c0fddf6272 Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Thu, 8 Jan 2026 08:07:04 +0000 Subject: [PATCH 8/9] Tidied run_herg_qc script, for a decent bit --- pcpostprocess/scripts/run_herg_qc.py | 57 +++++++++------------------- 1 file changed, 18 
insertions(+), 39 deletions(-) diff --git a/pcpostprocess/scripts/run_herg_qc.py b/pcpostprocess/scripts/run_herg_qc.py index 0584954..b03d65e 100644 --- a/pcpostprocess/scripts/run_herg_qc.py +++ b/pcpostprocess/scripts/run_herg_qc.py @@ -156,11 +156,8 @@ def run(data_path, output_path, save_id, staircase_protocols, 5. A file "chrono.txt" is created with the order that protocols were run in 6. Multi-protocol QC is performed on all wells that passed staircase QC (or all wells, if ``include_failed_traces=True``. - 7. - - - - + 7. QC results are stored in CSV, JSON, and tex, plus a file + ``passed_wells.txt`` listing the wells that passed final QC. @param data_path The path to read data from @param output_path The path to write output to @@ -185,7 +182,6 @@ def run(data_path, output_path, save_id, staircase_protocols, @param figure_size An optional tuple specifying the size of figures to create - """ # TODO reversal_spread_threshold should be specified the same way as all # other QC thresholds & parameters. @@ -245,23 +241,10 @@ def run(data_path, output_path, save_id, staircase_protocols, res_dict[protocol_name] = [] res_dict[protocol_name].append(match.group(2)) - #TEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMP - - def pront(*args, t=None): - if t is None: - print('********') - else: - print(f'******** {t} ********') - for arg in args: - print(arg) - - pd.set_option('display.max_columns', None) - pd.set_option('display.max_rows', None) - - #TEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMPTEMP - + # # At this point, despite its name, res_dict is not a dictionary of results, # but a map of QC protocol names onto lists of times (see comment above) + # # # Prepare arguments to call `run_staircase_qc` @@ -533,30 +516,22 @@ def pront(*args, t=None): # True/False results, while qc_df contains pass/fails. # - sys.exit(1) - - - - - - qc_styled_df = create_qc_table(qc_df) - logging.info(qc_styled_df) - qc_styled_df.to_latex(os.path.join(output_path, 'qc_table.tex')) - - # Save in csv format + # Store decisions in tex, csv, and JSON + create_qc_table(qc_df).to_latex(os.path.join(output_path, 'qc_table.tex')) qc_df.to_csv(os.path.join(output_path, f'QC-{save_id}.csv')) - - # Write data to JSON file qc_df.to_json(os.path.join(output_path, f'QC-{save_id}.json'), orient='records') - #  Load only QC vals. TODO use a new variabile name to avoid confusion - qc_vals_df = extract_df[['well', 'sweep', 'protocol', 'Rseal', 'Cm', 'Rseries']].copy() + # Store subtraction / extract / secondary QC numbers as CSV + extract_df.to_csv(os.path.join(output_path, 'subtraction_qc.csv')) + + #  Store QC numbers as CSV + fields = ['well', 'sweep', 'protocol', 'Rseal', 'Cm', 'Rseries'] + qc_vals_df = extract_df[fields].copy() qc_vals_df['drug'] = 'before' qc_vals_df.to_csv(os.path.join(output_path, 'qc_vals_df.csv')) - extract_df.to_csv(os.path.join(output_path, 'subtraction_qc.csv')) - + # Store passed wells with open(os.path.join(output_path, 'passed_wells.txt'), 'w') as fout: for well, passed in passed_qc_dict.items(): if passed: @@ -566,7 +541,11 @@ def pront(*args, t=None): def create_qc_table(qc_df): """ - ??? + Create a tex table based on a dataframe containing true/false for various + decision criteria. + + The table is returned as another dataframe, which can be made into tex + using ``.to_latex(path)``. 
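+
+    A minimal usage sketch (this simply mirrors the call made in ``run``
+    above, using the output path passed to ``run``)::
+
+        create_qc_table(qc_df).to_latex(os.path.join(output_path, 'qc_table.tex'))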
""" if len(qc_df.index) == 0: From e6c7b4407b89604a3164026903c73f3155c6121d Mon Sep 17 00:00:00 2001 From: Michael Clerx Date: Thu, 8 Jan 2026 13:38:27 +0000 Subject: [PATCH 9/9] Updated README --- README.md | 146 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 75 insertions(+), 71 deletions(-) diff --git a/README.md b/README.md index 62f4d7d..0937ad2 100644 --- a/README.md +++ b/README.md @@ -46,102 +46,106 @@ python3 -m unittest ## Usage -### Classic case +We are working to make pcpostprocess a reusable tool for different types of data. +At the moment, it is geared towards the specific use case detailed below. -- Syncropatch 384 ("DataControl 384" output) -- A full repeat of all protocols with an IKr blocker, e.g. E-4031 -- A staircase protocol, run 4 times: +### Experimental data + +- Syncropatch 384 ("DataControl 384") data +- The full set of protocols is repeated after addition of a strong IKr blocker, e.g. E-4031 +- Each set of protocols starts and ends with a "staircase" protocol, leading to a total of 4 staircases: - Before and after all other protocols, before drug block - Before and after all other protocols, after drug block -- Additional protocols ran in between the staircases -- Leak subtraction by... +So a full set of experiments would go "staircase, other1, other2, ... staircase, drug addition & wash in, staircase, other1, other2, ..., staircase". -Quality control (QC) may be run using the criteria outlined in [Rapid Characterization of hERG Channel Kinetics I](https://doi.org/10.1016/j.bpj.2019.07.029) and [Evaluating the predictive accuracy of ion channel models using data from multiple experimental designs](https://doi.org/10.1101/2024.08.16.608289). -These criteria assume the use of the `staircase` protocol for quality control, which should be the first and last protocol performed. -We also assume the presence of repeats after the addition of an IKr blocker (such as dofetilide). +Each protocol +- Starts with a leak estimation ramp +- Followed by a step to 40mV and then to -120mV +- Ends with a reversal potential step +- Provides an estimate of Rseal, Rseries, and Cm -Prior to performing QC and exporting, an `export_config.py` file should be added to the root of the data directory. -This file (see `example_config.py`) contains a Python `dict` (`Q2S_DC`) specifying the filenames of the protocols used for QC, and names they should be outputted with, as well as a Python dict (`D2S`) listing the other protocols and names to be used for their output. -Additionally, the `saveID` field specifies the name of the expeirment which appears in the output file names. +### Leak correction -An example input directory might contain the following subdirectories -- `staircaseramp (2)_2kHz_15.01.07`, staircase 1 before E-4031 -- `StaircaseInStaircaseramp (2)_2kHz_15.01.51`, non-QC protocol -- `staircaseramp (2)_2kHz_15.06.53`, staircase 2 before E-4031 -- `staircaseramp (2)_2kHz_15.11.33`, staircase 1 after E-4031 -- `StaircaseInStaircaseramp (2)_2kHz_15.12.17`, non-QC protocol -- `staircaseramp (2)_2kHz_15.17.19`, staircase 2 after E-4031 +- Leak correction is applied by estimating linear leak from the leak step. +- Before and after drug traces are both corrected +- The final signal is leak-corrected-before minus leak-corrected-after -`run_herg_qc` will use the time parts of the staircase names to work out which is which. 
-This assumes there are either 2 staircase runs (once before drug, once after), or 4 (as above)
+### Quality control
 
-Example:
-```
-# Configuration for run_herg_qc
+Quality control is based on [Lei et al. (2019) Rapid Characterization of hERG Channel Kinetics I](https://doi.org/10.1016/j.bpj.2019.07.029) and [Shuttleworth et al. (2025) Evaluating the predictive accuracy of ion channel models using data from multiple experimental designs](https://doi.org/10.1098/rsta.2024.0211).
 
-# Save name for this set of data
-saveID = 'EXPERIMENT_NAME'
+First, all four staircase protocols are checked in all wells using the Lei et al. criteria, along with new measures described in Shuttleworth et al.
+Any wells not rejected are then tested using a smaller set of (less protocol-specific) criteria detailed in Shuttleworth et al.
+Only wells passing all tests on all protocols are retained.
 
-# Name of protocols to use in QC, mapped onto names used in output
-D2S_QC = {
-    'staircaseramp (2)_2kHz': 'staircaseramp'
-}
+### Writing an export_config.py
 
-# Name of additional protocols to export, but not use in QC (again as a dict)
-D2S = {
-    'StaircaseInStaircaseramp (2)_2kHz': 'sis',
-}
-```
+Prior to performing QC and exporting, an `export_config.py` file should be added to the root of the data directory.
+An example file is available [here](./example_config.py).
+This should contain three entries:
 
-Next, we call `run_herg_qc`:
-```sh
-pcpostprocess run_herg_qc test_data/13112023_MW2_FF -w A01 A02 A03 -o output --output_traces
+1. A `saveID` variable, providing a name to use in exported data.
+2. A dictionary variable `Q2S_DC` indicating the name of the staircase protocol, and mapping it onto a more user-friendly name used in export.
+3. A dictionary variable `D2S` indicating names of other protocols to export, again mapping onto names for export.
 
-```
+For example, in the test data (see "Getting Started" above), we have six subdirectories:
+- `staircaseramp (2)_2kHz_15.01.07`, staircase run as first protocol before E-4031
+- `StaircaseInStaircaseramp (2)_2kHz_15.01.51`, non-QC protocol
+- `staircaseramp (2)_2kHz_15.06.53`, staircase run as last protocol before E-4031
+- `staircaseramp (2)_2kHz_15.11.33`, staircase run as first protocol after E-4031 addition
+- `StaircaseInStaircaseramp (2)_2kHz_15.12.17`, non-QC protocol
+- `staircaseramp (2)_2kHz_15.17.19`, staircase run as last protocol after E-4031
+Here each directory name is a protocol name followed by a timestamp.
+Corresponding dictionaries could be `Q2S_DC = {'staircaseramp (2)_2kHz': 'staircase'}` (indicating the staircase protocol) and `D2S = {'StaircaseInStaircaseramp (2)_2kHz': 'sis'}`.
+The `saveID` could be any string, but in our example we use `saveID = '13112023_MW2'`.
 
+### Running
 
+To run, we call `run_herg_qc` with:
 ```sh
-$ pcpostprocess run_herg_qc --help
+pcpostprocess run_herg_qc test_data/13112023_MW2_FF -o output --output_traces -w A01 A02 A03
 ```
+Here:
+- `test_data/13112023_MW2_FF` is the path to the test data,
+- `-o output` tells the script to store all results in the folder `output`,
+- `--output_traces` tells the script to export the data (instead of running QC only).
 
+The last part, `-w A01 A02 A03`, tells the script to only check the first three wells.
+This speeds things up for testing, but would be omitted in normal use.
 
-
-
-
-### Exporting Summary
-
-The `summarise_herg_export` command produces additionally output after `run_herg_qc` has been run.
+To see the full set of options, use ```sh -$ pcpostprocess summarise_herg_export --help - -usage: pcpostprocess summarise_herg_export [-h] [--cpus CPUS] - [--wells WELLS [WELLS ...]] [--output OUTPUT] - [--protocols PROTOCOLS [PROTOCOLS ...]] [-r REVERSAL] - [--experiment_name EXPERIMENT_NAME] - [--figsize FIGSIZE FIGSIZE] [--output_all] - [--log_level LOG_LEVEL] - data_dir qc_estimates_file - -positional arguments: - data_dir path to the directory containing the run_herg_qc results - -options: - -h, --help show this help message and exit - --cpus CPUS, -c CPUS - --wells WELLS [WELLS ...], -w WELLS [WELLS ...] wells to include in the output (default: all) - --output OUTPUT, -o OUTPUT path where the output will be saved - --protocols PROTOCOLS [PROTOCOLS ...] protocols to include (default: all) - -r REVERSAL, --reversal REVERSAL the reversal potential during the experiment - --experiment_name EXPERIMENT_NAME - --figsize FIGSIZE FIGSIZE - --output_all Flag specifying whether to output all plots (default: false) - --log_level LOG_LEVEL +$ pcpostprocess run_herg_qc --help ``` +### Interpreting the output + +After running the command above on the test data provided, the directory `output` will contain the subdirectories: + +- `-120mV time constant` Plots illustrating the fits performed when estimating time constants. These are output but not used in QC. +- `leak_correction` Plots illustrating the _linear_ leak correction process in all wells passing staircase QC. +- `qc3-bookend` Plots illustrating the "QC3 bookend" criterion used in Shuttleworth et al. +- `reversal_plots` Plots illustrating the reversal potential estimation +- `subtraction_plots` Plots illustrating the drug subtracted trace leak removal. +- `traces` The exported traces (times, voltages, currents) + +and the files: + +- `chrono.txt` Lists the order in which protocols were run. For the staircase, the repeat at the end of the sequence is indicated by an added `_2`. +- `passed_wells.txt` Lists the wells that **passed all QC**. +- `pcpostprocess_info.txt` Information on the run that generated this output (date, time, command used etc.) +- `QC-13112023_MW2.csv` A CSV representation of the pass/fail results for each individual criterion +- `QC-13112023_MW2.json` A JSON representation of the same +- `qc_table.tex` A table in Tex format with the above. +- `qc_vals_df.csv` Numerical values used in the final QC. +- `selected-13112023_MW2.txt` **Preliminary QC selection** based on staircase protocol 1 (before other protocols) and 2 (after other protocols) +- `selected-13112023_MW2-staircaseramp.txt` Preliminary selection based on staircase 1 +- `selected-13112023_MW2-staircaseramp_2.txt` Preliminary selection based on staircase 2 +- `subtraction_qc.csv` Numerical values used in the final QC and leak subtraction. + ## Contributing Although `pcpostprocess` is still early in its development, we have set up guidelines for user contributions in [CONTRIBUTING.md](./CONTRIBUTING.md).