We need to decide on the shape of `y_pred` and `y_true` passed to the `forward` function of each metric class. This is also discussed in #54.