GPU memory not released after interrupting the training script #15

@LearnerInGithub

Hi @bharatsingh430, GPU memory is not released properly after I interrupt the training script. In detail: I train on 2 GPUs ([0,1]). When I press Ctrl+C to stop the training script and then run nvidia-smi to check GPU usage, only GPU 1 has released its memory. GPU 0 still holds the allocated memory, and it stays that way even after waiting a long time. What could cause this, and how can I fix it? PS: I tried killing the Python process, but that did not work. Thanks in advance for your help!
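
For anyone trying to reproduce this, a minimal diagnostic sketch follows. It assumes, without confirmation from this thread, that the memory on GPU 0 is held by a process that survived Ctrl+C (for example a worker child process), and that `nvidia-smi` is on the PATH. The helper names `parse_compute_apps` and `find_gpu_processes` are hypothetical and are not part of the training code.

```python
import csv
import io
import subprocess

# Ask nvidia-smi for every compute process that still holds GPU memory.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=gpu_uuid,pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse the CSV text printed by the query above into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        gpu_uuid, pid, name, used = (f.strip() for f in fields)
        if not pid.isdigit():
            continue
        rows.append({
            "gpu_uuid": gpu_uuid,
            "pid": int(pid),
            "name": name,
            # used_memory can be reported as "[N/A]" on some setups
            "used_mib": int(used) if used.isdigit() else None,
        })
    return rows


def find_gpu_processes():
    """Run nvidia-smi and return the processes it reports (hypothetical helper)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(out.stdout)
```

Calling `find_gpu_processes()` after the interrupt should list any PID that is still attached to a GPU. If a PID other than the main Python process shows up, that process is a candidate for what keeps the memory allocated. Inside some containers `nvidia-smi` may not report PIDs at all, in which case this sketch returns an empty list.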
