Dataloading utils by daler3 · Pull Request #85 · OpenMined/PyVertical

daler3 · 2020-12-22T11:52:20Z

Description

Work-in-progress pull request for data-loading utilities, dataloaders and datasets.

Affected Dependencies

Currently uses PySyft 2.0. This is to be changed to either not use PySyft at all or, eventually, to use PySyft 3.0.

How has this been tested?

Manually. Unit and integration tests are still to be added properly.

Checklist

I have followed the Contribution Guidelines and Code of Conduct
I have commented my code following the OpenMined Styleguide
I have labeled this PR with the relevant Type labels
My changes are covered by tests

reformatted and added method to split and directly create vertical federated dataset

review-notebook-app · 2020-12-22T11:52:24Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

daler3 · 2020-12-22T11:59:35Z

To Resolve #81

Updated split_data_create_vertical_dataset to match with current dataset classes (i.e. samplesetwithlabels).
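As a rough illustration of what splitting data vertically means here, the sketch below divides each sample's feature columns among several workers and keeps a shared row index so the same data point can be matched across them. The function name `split_vertically`, its arguments and the worker ids are hypothetical; this is not the PR's `split_data_create_vertical_dataset`, whose signature is not shown in this thread.

```python
from typing import Dict, List, Sequence, Tuple


def split_vertically(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    worker_ids: Sequence[str],
) -> Tuple[Dict[str, List[Tuple[int, List[float]]]], List[Tuple[int, int]]]:
    """Give each worker a contiguous slice of the feature columns.

    Every feature slice and every label is paired with its row index, so a
    data point can be re-identified across workers. (Hypothetical helper.)
    """
    if not worker_ids:
        raise ValueError("need at least one worker id")
    n_features = len(rows[0]) if rows else 0
    # Ceiling division: earlier workers get full-width slices,
    # later ones may receive fewer (or no) columns.
    width = -(-n_features // len(worker_ids))
    partitions = {}
    for k, worker_id in enumerate(worker_ids):
        start, stop = k * width, min((k + 1) * width, n_features)
        partitions[worker_id] = [
            (i, list(row[start:stop])) for i, row in enumerate(rows)
        ]
    # The label holder only sees row indices and labels, never features.
    indexed_labels = [(i, label) for i, label in enumerate(labels)]
    return partitions, indexed_labels


rows = [[0.1, 0.2, 0.3, 0.4], [1.1, 1.2, 1.3, 1.4]]
partitions, indexed_labels = split_vertically(rows, [0, 1], ["alice", "bob"])
```

Carrying the index alongside each slice is one simple way to keep the partitions aligned; other schemes (hashed ids, for example) would work equally well.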


TTitcombe · 2021-01-09T16:53:24Z

examples/dualheaded_datautils/dataloaders.py

+import datasets
+
+
+"""I think this is not needed anymore"""


Do you mean we don't need the partitioned dataloader?

No, I mean that the default PyTorch DataLoader works, so we do not need a custom one (given how things are done now). See the notebook for an example.
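A minimal sketch of that point, assuming a map-style dataset: any class that implements `__len__` and `__getitem__` can be passed straight to `torch.utils.data.DataLoader`, and the default collation stacks the tensors and gathers the indices into a batch. `ToyVerticalDataset` is a hypothetical stand-in, not one of the dataset classes in this PR.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyVerticalDataset(Dataset):
    """Hypothetical stand-in for one worker's slice of the data."""

    def __init__(self, values: torch.Tensor):
        self.values = values

    def __len__(self) -> int:
        return len(self.values)

    def __getitem__(self, index: int):
        # Return the index too, so batches can be matched across workers.
        return self.values[index], index


# Five samples with three features each.
dataset = ToyVerticalDataset(torch.arange(15, dtype=torch.float32).reshape(5, 3))

# No custom loader: the stock DataLoader batches (tensor, int) pairs itself.
loader = DataLoader(dataset, batch_size=2, shuffle=False)
batches = list(loader)
```

With `shuffle=False` the indices arrive in order. A shuffled loader would still return matching (values, index) pairs, which is what makes the index useful for aligning workers.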

TTitcombe · 2021-01-09T16:53:51Z

examples/dualheaded_datautils/datasets.py

+        self.values = torch.Tensor(values) if is_labels else torch.stack(values)
+
+        self.worker_id = None 
+        if worker_id != None: 


that can simplify to if worker_id:
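One caveat on this simplification, sketched below: a truthiness check and a comparison against None only agree when no valid id is falsy. Whether ids such as 0 or an empty string can occur in this PR is not stated in the thread, so treat that as an assumption; `describe` is a hypothetical helper.

```python
def describe(worker_id=None):
    """Compare the two ways of testing whether a worker id was supplied."""
    by_truthiness = "set" if worker_id else "unset"
    by_identity = "set" if worker_id is not None else "unset"
    return by_truthiness, by_identity


# The checks agree for ordinary ids and for None...
agree_named = describe("alice")
agree_none = describe(None)

# ...but differ for falsy ids such as 0 or "".
differ_zero = describe(0)
differ_empty = describe("")
```

If ids are always non-empty strings, `if worker_id:` is fine; otherwise `if worker_id is not None:` keeps the original meaning.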

TTitcombe · 2021-01-09T16:54:40Z

examples/dualheaded_datautils/datasets.py

+        fmt_str = "FederatedDataset\n"
+        fmt_str += f"    Distributed across: {', '.join(str(x) for x in self.workers)}\n"
+        fmt_str += f"    Number of datapoints: {self.__len__()}\n"
+        return fmt_str


newline at the end of the file

TTitcombe · 2021-01-09T16:54:58Z

examples/dualheaded_datautils/enhancedSplitWorkers.py

+        self.dataset = dataset #It can also be None, and then it would be only computational
+        self.model = model 
+
+        self.level = level if level >= 0 else 0 #it should start from zero, otherwise throw error #TODO: implement error throwing


simplify to max(level, 0)
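The suggestion is behavior-preserving, as the sketch below checks over a few values. It also shows one possible shape for the error handling mentioned in the TODO; `clamp_level` and `validate_level` are hypothetical names, and raising `ValueError` is an assumption about what the author intends.

```python
def clamp_level(level: int) -> int:
    """Equivalent to `level if level >= 0 else 0`."""
    return max(level, 0)


def validate_level(level: int) -> int:
    """Alternative to clamping: reject negative levels outright."""
    if level < 0:
        raise ValueError(f"level must be >= 0, got {level}")
    return level


# Both forms give the same result for negative, zero and positive input.
samples = [-3, -1, 0, 1, 7]
matches = all(clamp_level(x) == (x if x >= 0 else 0) for x in samples)
```

Clamping silently hides a bad argument, while validation surfaces it; which is preferable depends on how callers construct workers.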

TTitcombe · 2021-01-09T16:55:29Z

examples/dualheaded_datautils/utils.py

+This code is meant to be used with dual-headed Neural Networks, where there are a number of different workers, 
+which agree on the labels, and there is a server with the labels only. 
+Code built upon: 
+- Abbas Ismail's (@abbas5253) work on dual-headed NN. In particular, check Configuration 1: 


Does this PR require abbas' PR to be merged?

TTitcombe · 2021-01-09T16:55:55Z

examples/dualheaded_datautils/utils.py

+        the third the index, which is to keep track of the same data point. 
+    """    
+
+    if worker_list == None:


if not worker_list: or if worker_list is None:
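A short sketch of why `is None` is the usual spelling for this check: `==` dispatches to `__eq__`, which a class may override, whereas `is` compares identity and cannot be overridden. The function `resolve_workers`, its default ids and the `AlwaysEqual` class are hypothetical and only serve the illustration.

```python
def resolve_workers(worker_list=None):
    """Fall back to default worker ids only when nothing was passed."""
    if worker_list is None:
        worker_list = ["worker_0", "worker_1"]  # hypothetical defaults
    return worker_list


class AlwaysEqual:
    """Pathological object whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equals_none = obj == None  # noqa: E711 - deliberately using == to show the pitfall
is_none = obj is None

defaulted = resolve_workers()
explicit_empty = resolve_workers([])
```

Note that `is None` leaves an explicitly passed empty list untouched, whereas a truthiness test would replace it with the defaults. Which of the two is wanted depends on whether an empty worker list is a meaningful input.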

daler3 added 16 commits October 18, 2020 13:29

Started exploration notebook with synthea data

327e663

Feature loading

d92667a

Added model and training loop

fd333ca

Added confusion matrix

30a2a38

started experimenting with dualhead

5154a38

Added verticalfederateddataset class

c782ddc

Added dataset parameter to split_data

465a91c

Reformatted and added create vertical method

d4615e3

reformatted and added method to split and directly create vertical federated dataset

added comments and TODOs to the dataset file

ca1ff75

first version of verticalFederatedDataLoader

693a334

Added comments for TODOs

f566e37

corrected index of last tensor in split_data

e382c6d

compacted sum of the len in verticalfeddataloader

74bafec

Uploaded skeleton for dualheaded loaders

bb8cae7

Initial commit

c6eba1f

dataloaders and datasets for dualheaded

593a1ff

daler3 added 5 commits December 27, 2020 11:19

Updated utility functions

a3ad412

Updated split_data_create_vertical_dataset to match with current dataset classes (i.e. samplesetwithlabels).

updated datasets

901dff8

updated to use custom dataloaders, updated example notebook

8bdc9f3

added enhanced worker class (wip)

54a4f16

changed workers with worker's ids in dictionaries; added models' segm…

f21d8af

…ents

TTitcombe requested changes Jan 9, 2021

View reviewed changes

Addressed Tom's comments

96ed8e9

daler3 wants to merge 22 commits into OpenMined:master from daler3:dataloading-utils


daler3 Jan 15, 2021
