How to solve the "overflow encountered" RuntimeWarning? #23

@ygren

Hello,
I am using this code to train on VOC0712, and a RuntimeWarning is produced that needs to be fixed.
The output is:
I0324 18:50:32.082937 8933 solver.cpp:219] Iteration 0 (0 iter/s, 0.709095s/20 iters), loss = 4.24172
I0324 18:50:32.082983 8933 solver.cpp:238] Train net output #0: accuarcy = 0
I0324 18:50:32.082993 8933 solver.cpp:238] Train net output #1: loss_bbox = 1.67801e-05 (* 1 = 1.67801e-05 loss)
I0324 18:50:32.082998 8933 solver.cpp:238] Train net output #2: loss_cls = 3.04193 (* 1 = 3.04193 loss)
I0324 18:50:32.083003 8933 solver.cpp:238] Train net output #3: rpn_cls_loss = 0.779677 (* 1 = 0.779677 loss)
I0324 18:50:32.083009 8933 solver.cpp:238] Train net output #4: rpn_loss_bbox = 0.493717 (* 1 = 0.493717 loss)
I0324 18:50:32.155719 8933 sgd_solver.cpp:105] Iteration 0, lr = 0.001
/home/dmt/FV/py-R-FCN/tools/../lib/fast_rcnn/bbox_transform.py:47: RuntimeWarning: overflow encountered in exp
  pred_w = np.exp(dw) * widths[:, np.newaxis]
/home/dmt/FV/py-R-FCN/tools/../lib/fast_rcnn/bbox_transform.py:48: RuntimeWarning: overflow encountered in exp
  pred_h = np.exp(dh) * heights[:, np.newaxis]
/home/dmt/FV/py-R-FCN/tools/../lib/fast_rcnn/bbox_transform.py:47: RuntimeWarning: overflow encountered in exp
  pred_w = np.exp(dw) * widths[:, np.newaxis]
/home/dmt/FV/py-R-FCN/tools/../lib/fast_rcnn/bbox_transform.py:48: RuntimeWarning: overflow encountered in exp
  pred_h = np.exp(dh) * heights[:, np.newaxis]
I0324 18:50:45.361021 8933 solver.cpp:219] Iteration 20 (1.50624 iter/s, 13.2781s/20 iters), loss = nan
I0324 18:50:45.361094 8933 solver.cpp:238] Train net output #0: accuarcy = 0
I0324 18:50:45.361110 8933 solver.cpp:238] Train net output #1: loss_bbox = nan (* 1 = nan loss)
I0324 18:50:45.361119 8933 solver.cpp:238] Train net output #2: loss_cls = 87.3365 (* 1 = 87.3365 loss)
I0324 18:50:45.361126 8933 solver.cpp:238] Train net output #3: rpn_cls_loss = 0.693147 (* 1 = 0.693147 loss)
I0324 18:50:45.361137 8933 solver.cpp:238] Train net output #4: rpn_loss_bbox = 1.85539e+33 (* 1 = 1.85539e+33 loss)
I0324 18:50:45.446388 8933 sgd_solver.cpp:105] Iteration 20, lr = 0.001

My environment is as follows:
Ubuntu 16.04 LTS
CUDA 9.1
cuDNN 7.1
NVIDIA Tesla P100 GPU
...
I have tried changing base_lr and switching to 0-based boxes, but neither works. The warning does not appear when training on a single GPU.
Can you help me fix it?
Thanks!
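
For reference, the two lines flagged in the log exponentiate the predicted deltas `dw` and `dh` directly, so any very large delta overflows in `np.exp`. Below is a minimal sketch of one possible workaround: clamping the deltas before the exponential. The helper name `safe_pred_sizes` and the constant `BBOX_XFORM_CLIP` are hypothetical and are not part of py-R-FCN; the cap of log(1000/16) is an assumption borrowed from what some later detection codebases use.

```python
import numpy as np

# Assumption: upper bound on dw/dh, not a value taken from py-R-FCN.
BBOX_XFORM_CLIP = float(np.log(1000.0 / 16.0))


def safe_pred_sizes(dw, dh, widths, heights, clip=BBOX_XFORM_CLIP):
    """Hypothetical helper: clamp the deltas, then apply the same
    expressions that appear at bbox_transform.py lines 47 and 48."""
    dw = np.minimum(dw, clip)  # cap large deltas so exp() stays finite
    dh = np.minimum(dh, clip)
    pred_w = np.exp(dw) * widths[:, np.newaxis]
    pred_h = np.exp(dh) * heights[:, np.newaxis]
    return pred_w, pred_h


# Toy inputs: the second box has deltas large enough to overflow float32.
widths = np.array([16.0, 32.0], dtype=np.float32)
heights = np.array([16.0, 64.0], dtype=np.float32)
dw = np.array([[0.0], [200.0]], dtype=np.float32)
dh = np.array([[0.0], [200.0]], dtype=np.float32)

pred_w, pred_h = safe_pred_sizes(dw, dh, widths, heights)
print(pred_w.ravel(), pred_h.ravel())
```

Clamping of this kind would only keep the exponential finite. It does not explain why the deltas grow so large in the multi-GPU run, and NaN values pass through `np.minimum` unchanged, so the `loss = nan` at iteration 20 would still need to be diagnosed separately.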
