TL;DR: Can everyone test their metric on a GPU? Most of them are failing on mine.
Issue
Hi all, I discovered this in my own code, but it could affect others as well:
In my metric class, Precision, I declare new tensors like this:
```python
true_oh = torch.zeros(y_true.size(0), self.num_classes).scatter_(
    1, y_true.unsqueeze(1), 1
)
pred_oh = torch.zeros(y_pred.size(0), self.num_classes).scatter_(
    1, y_pred.unsqueeze(1), 1
)
```
However, this fails on the GPU: `torch.zeros` allocates on the CPU by default, so these tensors end up on a different device than the incoming `y_true`/`y_pred`.
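One minimal fix is to allocate the new tensors on the same device as the inputs. A sketch of a drop-in replacement for the snippet above, assuming `y_true` and `y_pred` already live on the right device:

```python
# Allocate directly on the inputs' device instead of the CPU default.
true_oh = torch.zeros(
    y_true.size(0), self.num_classes, device=y_true.device
).scatter_(1, y_true.unsqueeze(1), 1)
pred_oh = torch.zeros(
    y_pred.size(0), self.num_classes, device=y_pred.device
).scatter_(1, y_pred.unsqueeze(1), 1)
```

`torch.nn.functional.one_hot(y_true, self.num_classes)` would do the same job and inherits the device from its input automatically (it returns int64, so a cast may be needed).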
I get the same issue with Recall, @salomaestro.
For Entropy I get a different GPU error, @Seilmast:
```
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
```
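That one is straightforward: NumPy can only see host memory, so the tensor has to be moved to the CPU (and detached, if it requires grad) before conversion. A small self-contained illustration, where `scores` is a hypothetical stand-in for whatever Entropy accumulates:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
scores = torch.rand(8, device=device)  # hypothetical accumulated values

# scores.numpy() raises the TypeError above on a CUDA tensor;
# detach().cpu() copies it to host memory first.
scores_np = scores.detach().cpu().numpy()
```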
And for F1, @sot176, I receive the following error:
```
/CollaborativeCoding/metrics/F1.py", line 161, in returnmetric
    self.y_pred = torch.cat(self.y_pred)
                  ^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated
```
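This one is independent of the device: `torch.cat` needs tensors with at least one dimension, and the list apparently contains zero-dimensional scalars. `torch.stack` accepts scalars, or each element can be given a batch dimension before concatenating:

```python
import torch

preds = [torch.tensor(0), torch.tensor(2), torch.tensor(1)]  # zero-dim tensors

# torch.cat(preds)            # RuntimeError: zero-dimensional tensor ...
stacked = torch.stack(preds)  # works: tensor([0, 2, 1])
catted = torch.cat([p.reshape(1) for p in preds])  # same result via cat
```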
Accuracy seems to work well on both CPU and GPU, well done @hzavadil98.
Suggestion
Either:
- we pass the device variable to the metric classes (see the sketch below), or
- we avoid allocating new tensors inside the metric classes and derive everything from the inputs' device.
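A minimal sketch of the first option; the constructor signature here is hypothetical, not what our classes currently look like:

```python
import torch

class Precision:
    # Hypothetical signature: the trainer passes its device in once.
    def __init__(self, num_classes, device="cpu"):
        self.num_classes = num_classes
        self.device = torch.device(device)

    def update(self, y_true, y_pred):
        # All new tensors are created on the configured device.
        true_oh = torch.zeros(
            y_true.size(0), self.num_classes, device=self.device
        ).scatter_(1, y_true.unsqueeze(1), 1)
        pred_oh = torch.zeros(
            y_pred.size(0), self.num_classes, device=self.device
        ).scatter_(1, y_pred.unsqueeze(1), 1)
        ...
```

The second option avoids the extra plumbing entirely: `device=y_true.device` (or `F.one_hot`) keeps everything wherever the inputs already are.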
Any thoughts?