
Commit 9154519

Updated readme and comment.

1 parent 7232e6a


2 files changed: +21 −2 lines


research/gam/README.md

Lines changed: 14 additions & 0 deletions

```diff
@@ -40,6 +40,20 @@ More details can be found in our
 [slides](https://drive.google.com/open?id=1tWEMoyrbLnzfSfTfYFi9eWgZWaPKF3Uu) or
 [poster](https://drive.google.com/file/d/1BZNR4B-xM41hdLLqx4mLsQ4KKJOhjgqV/view).
 
+## Updated Results
+A bug was discovered in the implementation of the GAM agreement regularization term after publication. We have fixed the bug (PR #82) and have rerun the affected experiments. Below are the updated results (note that the GAM* results are not affected).
+
+Dataset  | Method      | Updated Accuracy (mean ± stderr)
+-------- | :---------: | :-------------:
+Cora     | MLP + GAM   | 80.2 ± 0.31
+         | GCN + GAM   | 84.8 ± 0.06
+Citeseer | MLP + GAM   | 73.2 ± 0.06
+         | GCN + GAM   | 72.2 ± 0.44
+Pubmed   | MLP + GAM   | 75.6 ± 0.07
+         | GCN + GAM   | 81.0 ± 0.09
+
+Although some of these numbers are lower than what was originally reported, the takeaways presented in our paper still hold: GAM adds a significant boost to the original base models, and also performs better than the other forms of regularization reported in our paper. Nevertheless, we apologise for any inconvenience caused by this bug!
+
 ## How to run
 
 To run GAM on a graph-based dataset (e.g., Cora, Citeseer, Pubmed), from this
```
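The table above reports accuracy as mean ± standard error across repeated runs. As a minimal illustration of how such values can be computed (the per-seed accuracies below are invented for the example and are not from the GAM experiments):

```python
import numpy as np

# Hypothetical per-seed test accuracies for one (dataset, method) cell.
accuracies = np.array([80.0, 80.5, 79.9, 80.4])

mean = accuracies.mean()
# Standard error of the mean: sample std (ddof=1) over sqrt(number of runs).
stderr = accuracies.std(ddof=1) / np.sqrt(len(accuracies))
```

With these four runs, `mean` is 80.2 and `stderr` is roughly 0.15, matching the "mean ± stderr" format used in the table.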

research/gam/gam/trainer/trainer_cotrain.py

Lines changed: 7 additions & 2 deletions

```diff
@@ -356,7 +356,10 @@ def _select_samples_to_label(self, data, trainer_cls, session):
     """
     # Select the candidate samples for self-labeling, and make predictions.
     # Remove the validation and test samples from the unlabeled data, if there,
-    # to avoid self-labeling them.
+    # to avoid self-labeling them. We could potentially leave the test edges
+    # in, but once a node is self-labeled, its label is fixed for the remaining
+    # co-train iterations, so it would not take advantage of the improved
+    # versions of the model.
     indices_unlabeled = data.get_indices_unlabeled()
     eval_ind = set(data.get_indices_val()) | set(data.get_indices_test())
     indices_unlabeled = np.asarray(
```
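The code at the end of this hunk filters the validation and test indices out of the unlabeled pool before self-labeling. A self-contained sketch of that filtering step, with made-up index values standing in for the `data.get_indices_*()` accessors:

```python
import numpy as np

# Invented stand-ins for data.get_indices_unlabeled() / _val() / _test().
indices_unlabeled = np.array([0, 1, 2, 3, 4, 5, 6, 7])
indices_val = [2, 5]
indices_test = [7]

# Union of evaluation indices, then keep only unlabeled indices outside it,
# mirroring the filtering in _select_samples_to_label.
eval_ind = set(indices_val) | set(indices_test)
indices_unlabeled = np.asarray(
    [i for i in indices_unlabeled if i not in eval_ind])
# indices_unlabeled is now [0, 1, 3, 4, 6]
```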
```diff
@@ -633,7 +636,9 @@ def train(self, data, **kwargs):
       logging.info(
           '--------- Cotrain step %6d | Accuracy val: %10.4f | '
           'Accuracy test: %10.4f ---------', step, val_acc, test_acc)
-
+      logging.info(
+          'Best validation acc: %.4f, corresponding test acc: %.4f at '
+          'iteration %d', best_val_acc, test_acc_at_best, iter_at_best)
       if self.first_iter_original and step == 0:
         logging.info('No self-labeling because the first iteration trains the '
                      'original classifier for evaluation purposes.')
```
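The new log line reports the best validation accuracy seen so far together with the test accuracy at that iteration. A minimal sketch of the bookkeeping this implies (`best_val_acc`, `test_acc_at_best`, and `iter_at_best` are names taken from the diff; the history values here are invented):

```python
# Track the best validation accuracy across co-train steps, along with the
# test accuracy and step index observed at that same point.
best_val_acc, test_acc_at_best, iter_at_best = -1.0, -1.0, -1

# Invented (step, val_acc, test_acc) tuples for illustration.
history = [(0, 0.70, 0.68), (1, 0.75, 0.71), (2, 0.73, 0.74)]

for step, val_acc, test_acc in history:
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        test_acc_at_best = test_acc
        iter_at_best = step
# Selects step 1: best_val_acc=0.75, test_acc_at_best=0.71, iter_at_best=1
```

Note that the reported test accuracy is the one at the best-validation step, not the best test accuracy overall, which avoids selecting the model on test performance.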

0 commit comments