
Conversation

@hzavadil98
Contributor

In addition to #52, where logits are now passed directly to the metrics, I added, as proposed, the micro/macro averaging option, to be passed to each metric (except for entropy, where I believe it doesn't make sense, right? @Seilmast). Here is a pretty thorough description of the issue: Micro/Macro. This resolves #30 for Johan. Please push your adjusted metrics directly to this draft PR before we merge them.
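For reference, a minimal sketch of the micro/macro distinction for a precision-style metric, assuming raw N x C logits and integer labels as discussed later in this thread. The function name and the macro_averaging flag mirror the argument mentioned below, but this is illustrative, not the repo's implementation:

```python
import torch

def precision_from_logits(logits, y_true, macro_averaging=False):
    """Precision from raw (N, C) logits and (N,) integer labels.

    Micro: pool TP/FP counts over all classes, then divide once.
    Macro: compute per-class precision, then take the unweighted mean.
    """
    num_classes = logits.shape[1]
    y_pred = logits.argmax(dim=1)

    tp = torch.zeros(num_classes)
    fp = torch.zeros(num_classes)
    for c in range(num_classes):
        tp[c] = ((y_pred == c) & (y_true == c)).sum()
        fp[c] = ((y_pred == c) & (y_true != c)).sum()

    if macro_averaging:
        # Guard against 0/0 for classes that were never predicted.
        per_class = tp / (tp + fp).clamp(min=1)
        return per_class.mean()
    return tp.sum() / (tp.sum() + fp.sum())
```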

@sot176
Contributor

sot176 commented Feb 10, 2025

I have now pushed my adjusted metric to this draft PR :)

@Seilmast
Collaborator

Micro/macro averages don't make sense for Shannon entropy @hzavadil98. I have added some averaging functionality for the batchwise mean or sum of the entropy. As long as the option is passed as an argument, it shouldn't make any difference.
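For reference, a small sketch of what a batchwise mean/sum option for Shannon entropy could look like; the reduction argument name here is an assumption, not necessarily what the repo uses:

```python
import torch
import torch.nn.functional as F

def shannon_entropy(logits, reduction="mean"):
    """Per-sample Shannon entropy of the softmax distribution over (N, C) logits."""
    log_p = F.log_softmax(logits, dim=1)
    entropy = -(log_p.exp() * log_p).sum(dim=1)  # shape (N,)
    if reduction == "mean":
        return entropy.mean()
    if reduction == "sum":
        return entropy.sum()
    return entropy  # no reduction: one entropy value per sample
```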

@hzavadil98 hzavadil98 removed the request for review from sot176 February 10, 2025 15:14
@c-salomonsen
Contributor

c-salomonsen commented Feb 11, 2025

Yep, I'm working on mine sporadically. Will push before too long.

Could anyone clarify for me: what shapes do the inputs y_pred and y_true have? Because if y_pred are the logits being passed in, I suppose they are one-hot encoded, but is y_true as well?

@Johanmkr
Contributor

> Yep, I'm working on mine sporadically. Will push before too long.
>
> Could anyone clarify for me: what shapes do the inputs y_pred and y_true have? Because if y_pred are the logits being passed in, I suppose they are one-hot encoded, but is y_true as well?

I do the one-hot encoding within the metric, but we should maybe decide on this.

@Seilmast
Collaborator

Logits are not one-hot encoded. The shape you should expect is N x C, where N is the batch size and C is the number of classes.
y_true is probably just an N x 1 vector, with each row being the numerical label of the sample. That can be converted to a one-hot matrix should you need to do so.
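To illustrate the shapes being described here, a toy example using torch.nn.functional.one_hot for the conversion, should a metric need it (the values are random placeholders):

```python
import torch
import torch.nn.functional as F

N, C = 4, 3                           # batch size, number of classes
logits = torch.randn(N, C)            # raw network output, shape (N, C)
y_true = torch.randint(0, C, (N, 1))  # integer labels, shape (N, 1)

# Convert the labels to a one-hot matrix of shape (N, C) if a metric needs it.
y_true_onehot = F.one_hot(y_true.squeeze(1), num_classes=C)
```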

@Johanmkr Johanmkr mentioned this pull request Feb 11, 2025
@c-salomonsen
Contributor

c-salomonsen commented Feb 11, 2025

> Logits are not one-hot encoded. The shape you should expect is N x C, where N is the batch size and C is the number of classes.

That's what one-hot encoding is.

> y_true is probably just an N x 1 vector, with each row being the numerical label of the sample. That can be converted to a one-hot matrix should you need to do so.

Okay, good. I've noticed that this ambiguity w.r.t. the shapes of y_true and y_pred means that some of us expect different input shapes. So I guess we should specify whether the sizes you are describing are what we want to use, or whether both y_pred and y_true should already have shape $N \times C$ when sent to the MetricWrapper?

@Seilmast
Collaborator

Maybe we're not referring to the same thing.
A one-hot encoded vector is a sparse vector with a single value at the index representing the class.
Logits are not sparse; they represent the network's mapping of the input onto some probability space (-inf to inf before softmax).
The logits should be the pure output of the network. It's the same thing you would input into the CrossEntropyLoss function, for example.

Yeah, we should specify the sizes. I think working with the logits and labels in the form N x C and N x 1 is easier, as:

  1. You usually have the labels in that form when doing training/inference.
  2. You don't have to touch the logits after the network returns them.

It allows us to send the input sample through the network and just use the raw output in the metric instance.
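Sketching that workflow end to end (the model is a stand-in, and the commented-out metric call is hypothetical):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 3)           # stand-in network: 10 features, 3 classes
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 10)             # batch of 8 samples
y_true = torch.randint(0, 3, (8,)) # integer labels; CrossEntropyLoss expects this 1-D form

logits = model(x)                  # raw output, shape (8, 3); no softmax applied
loss = criterion(logits, y_true)   # CrossEntropyLoss consumes the logits directly
# metric(logits, y_true)           # a metric instance would receive the same raw logits
```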

@c-salomonsen
Contributor

I see, I was mixing it up with the softmax output.

I think we could go for that 👍🏻

This will hopefully simplify the arguments to each metric slightly. Note that the precision metric needs a rename, as we use the argument macro_averaging = True/False as input, but that one has the argument micro_averaging.
@c-salomonsen
Contributor

c-salomonsen commented Feb 11, 2025

I just pushed my updated metric + tests to this PR. Note that the test will purposefully fail until the Precision metric is updated (see this commit message, @Johanmkr).

The tests will likely need to be changed a bit to accommodate the differing input arguments to @Seilmast's entropy metric, but this works for now 😉 See here for the test on the MetricWrapper.
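For illustration, a pytest-style shape test in the spirit of what's described, written against the precision_from_logits sketch from earlier in this thread rather than the repo's actual MetricWrapper API:

```python
import torch

def test_precision_accepts_raw_logits():
    # Logits of shape (N, C) and integer labels of shape (N,).
    logits = torch.randn(16, 4)
    y_true = torch.randint(0, 4, (16,))
    for macro_averaging in (False, True):
        value = precision_from_logits(logits, y_true, macro_averaging)
        assert 0.0 <= value.item() <= 1.0
```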

@Johanmkr
Contributor

> I just pushed my updated metric + tests to this PR. Note that the test will purposefully fail until the Precision metric is updated (see this commit message, @Johanmkr).
>
> The tests will likely need to be changed a bit to accommodate the differing input arguments to @Seilmast's entropy metric, but this works for now 😉 See here for the test on the MetricWrapper.

I am resolving this now, but I am still confused about the shapes. So the logits will have shape N x C, but should we enforce y_true to have shape N x 1 rather than just N, which it seems to have in the MetricWrapper test, @salomaestro?

I will try to fix my code during the breaks of the ethics course today.

@c-salomonsen
Contributor

c-salomonsen commented Feb 12, 2025

@Johanmkr, as long as your inputs and outputs follow the agreed-upon shapes, you may modify the vectors to your liking. For instance, if you want to use y_true with shape N, you could drop the extra dimension with y_true.squeeze(), which will do that for you 😉 Maybe we agree on this shape in our meeting tomorrow?
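Concretely, the squeeze in question (shapes illustrative):

```python
import torch

y_true = torch.randint(0, 3, (8, 1))  # labels with shape (N, 1)
y_true = y_true.squeeze()             # now shape (N,), i.e. (8,)
```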

@Johanmkr
Contributor

> @Johanmkr, as long as your inputs and outputs follow the agreed-upon shapes, you may modify the vectors to your liking. For instance, if you want to use y_true with shape N, you could drop the extra dimension with y_true.squeeze(), which will do that for you 😉 Maybe we agree on this shape in our meeting tomorrow?

Yes, we can discuss it during today's meeting, because it seems to me that the shape we all use for y_true in the tests is in fact (N) and not (N, 1), but let's have a look later today :)

@Johanmkr
Contributor

(image: Screenshot_20250213_104230_LinkedIn.jpg)

@c-salomonsen
Contributor

@Johanmkr git push origin main --force 😆

@hzavadil98 hzavadil98 marked this pull request as ready for review February 13, 2025 11:21
@hzavadil98 hzavadil98 merged commit 77b5002 into main Feb 13, 2025
4 checks passed
@hzavadil98 hzavadil98 deleted the Jan-metrics branch February 13, 2025 12:15