Description
In most bigger migrations, we come to a point where some package with lots of children fails spuriously and then takes ages to be retried. For the 3.14 migration, the bot stumbled over numpy (discussion), and despite the fact that conda-forge/numpy-feedstock#363 had long been merged, the blocker has persisted since the restart of the 3.14 migration 36h ago.
In this particular case, the reason looks to be that the solver checks still fail as before even when the bot does retry things. Additionally, #4650 added some logic to boost feedstocks with lots of children (i.e. broadly trying to tackle the same issue).
However, I think there's a bigger win to be had here: avoid wasteful retries (where nothing has changed) and instead prioritise retries where they're most impactful. The logic for that would be pretty simple (I'm assuming we have all the relevant timestamps available in the graph and the bot-metadata):
whenever a feedstock has been updated since the last failed bot attempt, always retry that feedstock
Obviously, if there are several feedstocks like this, we can sort them in descending order by number of children and then cap the list at some number (e.g. the top 5 previously-failed-but-updated-in-the-meantime feedstocks get retried unconditionally, outside of the bot share accounting, while the rest run as usual).
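To make the proposal concrete, here's a minimal sketch of the selection step. The field names (`last_bot_attempt`, `last_update`, `num_children`) are hypothetical and don't reflect the bot's actual graph/bot-metadata schema; this only illustrates the "updated since last failed attempt, sorted by child count, capped" rule:

```python
from dataclasses import dataclass


@dataclass
class Feedstock:
    # Hypothetical stand-in for a node in the bot's graph; field names
    # are illustrative, not the real bot-metadata schema.
    name: str
    num_children: int
    last_bot_attempt: float  # timestamp of the last failed bot attempt
    last_update: float       # timestamp of the last feedstock update


def prioritized_retries(failed_feedstocks: list[Feedstock], cap: int = 5) -> list[Feedstock]:
    """Pick up to `cap` previously-failed feedstocks that have been updated
    since the last bot attempt, most-children first. These would be retried
    unconditionally, outside the normal bot share accounting."""
    # Only retry where something actually changed since the failure.
    updated = [f for f in failed_feedstocks if f.last_update > f.last_bot_attempt]
    # Retry the most impactful (most children) first, capped.
    updated.sort(key=lambda f: f.num_children, reverse=True)
    return updated[:cap]
```

Everything not returned by this function would then go through the usual retry cadence, so the change stays cheap: one timestamp comparison per previously-failed node.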
The reason should be obvious: if the feedstock has been updated, the reason for the previous bot failure might very well be gone (and in the case of manual fixes to feedstocks because the migration was bottlenecked on them, that might even have been the whole point).