fix: add guard for empty array in g123 when window_size exceeds avail#937

Open
31puneet wants to merge 2 commits into malariagen:master from 31puneet:fix/g123-empty-array-guard
@31puneet
Contributor

Description

Fixes #571
Relates to #695

This PR fixes the intermittent IndexError in G123 tests (test_g123_gwss_...[af1_sim] and test_g123_calibration[af1_sim]).

The Bug

allel.moving_statistic(gt, ..., size=window_size) returns an empty array (shape (0,)) when gt.shape[0] < window_size, because no complete window fits in the data. Downstream code that indexes into this array (x[0], x[-1]) then crashes with:

IndexError: index 0 is out of bounds for axis 0 with size 0
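
The failure mode can be sketched without scikit-allel. The helper below, `windowed_stat`, is a hypothetical pure-Python stand-in (not the library's implementation) that assumes non-overlapping windows and drops any incomplete trailing window, which is the behaviour described above:

```python
def windowed_stat(values, statistic, size):
    """Apply `statistic` to consecutive non-overlapping windows of `size` items.

    Hypothetical stand-in for a moving-window statistic: an incomplete
    trailing window is dropped, so the result is empty when
    len(values) < size.
    """
    n_windows = len(values) // size
    return [statistic(values[i * size:(i + 1) * size]) for i in range(n_windows)]


sites = list(range(80))                    # only 80 sites available
x = windowed_stat(sites, sum, size=100)    # window larger than the data
print(len(x))                              # no complete window fits

try:
    x[0]                                   # same access pattern as the failing code
except IndexError as exc:
    print("IndexError:", exc)
```

The exact error text differs between a Python list and a NumPy array, but the cause is the same: indexing into a result that contains no windows.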

Why It's Intermittent

The test fixture in conftest.py used p_site=np.random.random(), and the test in test_g123.py picks a random window_size between 100 and 500. When a low p_site is drawn (producing few sites) together with a large window_size, the number of sites falls below the window size and triggers the crash.
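
A rough way to see why only some runs fail is to simulate the two random draws. This is a sketch under stated assumptions, not a model of the real fixture: `n_candidates=1000` is a made-up number of candidate sites, and the number of retained sites is approximated as `p_site * n_candidates`.

```python
import random


def failure_rate(p_low, p_high, n_candidates=1000, trials=10_000, seed=0):
    """Estimate how often the site count falls below the window size.

    n_candidates is a hypothetical value chosen for illustration; the
    real simulator's site count is not stated in this PR.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        p_site = rng.uniform(p_low, p_high)
        n_sites = int(p_site * n_candidates)   # approximate number of retained sites
        window_size = rng.randint(100, 500)    # same range the test draws from
        if n_sites < window_size:
            failures += 1
    return failures / trials


print("p_site in [0.0, 1.0]:", failure_rate(0.0, 1.0))
print("p_site in [0.5, 1.0]:", failure_rate(0.5, 1.0))
```

Under these assumptions, a meaningful fraction of draws fail when p_site can be close to zero, and none fail once p_site is at least 0.5, since the site count can then never drop below the largest window size.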


Changes Made

1. g123.py

  • Added a ValueError guard in _g123_gwss (before allel.moving_statistic) that checks if gt.shape[0] < window_size
  • Added the same guard in _g123_calibration (inside the window_sizes loop)

2. conftest.py

  • Changed p_site=np.random.random() to p_site=np.random.uniform(0.5, 1.0) in Af1Simulator.init_hap_sites() to ensure test data always generates sufficient haplotype sites
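
For readers who have not opened the diff, the guard in g123.py amounts to a check like the one below. This is a minimal sketch: the helper name `check_window_size` and the error message wording are assumptions for illustration, not the code in this PR.

```python
def check_window_size(n_sites, window_size):
    """Raise a clear error before computing windowed statistics.

    Hypothetical helper: the actual guard in this PR may be written
    inline and may use a different message.
    """
    if n_sites < window_size:
        raise ValueError(
            f"Not enough sites ({n_sites}) for window size ({window_size}); "
            "choose a smaller window_size or a larger region."
        )


check_window_size(n_sites=500, window_size=100)    # passes silently

try:
    check_window_size(n_sites=80, window_size=100)
except ValueError as exc:
    print("ValueError:", exc)
```

Placing the check before the windowed computation means the caller sees the real cause (too few sites) rather than an indexing error raised further downstream.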

Test Results

Before Fix (no guard, p_site=np.random.random())

Ran test_g123.py -k "af1_sim" in a loop of 20:

  • Run 13/20 FAILED: IndexError on test_g123_gwss_with_phased_sites[af1_sim] and test_g123_calibration[af1_sim]

After Fix (guard added, p_site=np.random.uniform(0.5, 1.0))

Ran test_g123.py -k "af1_sim" in a loop of 50:

  • 50/50 PASSED — zero failures

Also verified related test suites (test_h12, test_h1x, test_fst) with af1_sim — 20/20 passed.


Impact

  • Production code: Users who call g123_gwss() or g123_calibration() with a window_size larger than available sites will now get a clear ValueError instead of a confusing IndexError
  • Tests: The conftest.py change ensures simulated data always has enough haplotype sites, eliminating intermittent test failures
  • No breaking changes

@31puneet changed the title from "fix: add guard for empty array in g123 when window_size exceeds avail…" to "fix: add guard for empty array in g123 when window_size exceeds avail" on Feb 24, 2026
Development

Successfully merging this pull request may close these issues.

FAILED tests/anoph/test_g123.py::test_g123_calibration[af1_sim] - IndexError: index -1 is out of bounds for axis 0 with size 0
