Is your feature request related to a problem? Please describe.
Currently, when the fine-tuning script crashes, much of the state associated with it is lost, and some file descriptors or connections are left open. For example, runs tracked by Aim still show as running even though the program has exited.
The proposal is to add an exit handler that closes these descriptors and also allows saving some state from the system before exiting.
Describe the solution you'd like
We need to look into what helps here. Modules like https://docs.python.org/3/library/atexit.html exist, but they only help in certain scenarios and not all of them.
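As a starting point for discussion, here is a rough sketch of what such a handler could look like. It combines `atexit` with a `SIGTERM` handler, since `atexit` callbacks are not called when the process is killed by a signal Python does not handle, when `os._exit()` is called, or on a fatal interpreter error. The names `register_cleanup`, `run_cleanup`, and `install_exit_handlers` are hypothetical and not part of the existing script, and this approach still cannot cover `SIGKILL`.

```python
import atexit
import signal
import sys

# Hypothetical registry of cleanup callbacks; names are illustrative only.
_cleanup_callbacks = []
_cleanup_done = False


def register_cleanup(func, *args, **kwargs):
    """Remember a callable (e.g. a tracker's or file's close method) to run at exit."""
    _cleanup_callbacks.append((func, args, kwargs))


def run_cleanup():
    """Run all registered callbacks once, most recently registered first."""
    global _cleanup_done
    if _cleanup_done:
        return
    _cleanup_done = True
    while _cleanup_callbacks:
        func, args, kwargs = _cleanup_callbacks.pop()
        try:
            func(*args, **kwargs)
        except Exception as exc:
            # One failing callback should not prevent the others from running.
            print(f"cleanup callback {func!r} failed: {exc}", file=sys.stderr)


def _handle_signal(signum, frame):
    # Clean up, then exit with the conventional 128 + signal number status.
    run_cleanup()
    sys.exit(128 + signum)


def install_exit_handlers():
    """Call once from the main thread, early in the script."""
    # Covers normal interpreter exit, including exit after an uncaught exception.
    atexit.register(run_cleanup)
    # Covers termination requests, which atexit alone does not handle.
    signal.signal(signal.SIGTERM, _handle_signal)
```

In the fine-tuning script this might be used by calling `install_exit_handlers()` at startup and then `register_cleanup(tracker.close)` for each resource that needs closing, where `tracker` stands in for whatever object holds the open run or connection. Saving state before exit could be registered the same way, as another callback.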