Create 11_differences_original_replication_package.md

carlosparadis · web-flow · commit c7e07591818f · 2022-12-12T01:52:10.000-10:00
diff --git a/pages/11_differences_original_replication_package.md b/pages/11_differences_original_replication_package.md
@@ -0,0 +1,32 @@
+---
+layout: default
+title: Differences Between Original Work and Replication Package
+parent: Home
+nav_order: 11
+---
+
+## 11.1 How the Analytic Approaches Used in the Original Study vs. the Replication Differ
+
+As mentioned, the software used two years ago was no longer available for our replication. We also learned a lot from continuing to use the techniques described in the original study. In particular, we no longer employ the custom 1st-percentile trimming threshold calculated outside Tetrad; instead, we use the Majority Ensemble Method included within Tetrad when running the FGES causal discovery algorithm. We have learned in other projects that the two approaches to deciding which edges to keep from bootstrapped searches produce largely the same degree of trimming, while the Majority Ensemble Method reduces the effort spent outside the main tool. The following table summarizes the methodological differences between the original study and the replication.
+
+The table quotes particular phrases from the study (Original Study column), states how each phrase should be interpreted given the differences in approach or results experienced during the replication (Replication column), and comments on the impact that difference has on the study conclusions (Impact of Difference column). By study “conclusion” we mean the following: even after controlling for work-rate variables in the current time period and social-smell variables in the next time period, social smells in the current time period directly affect work-rate variables in the next time period, demonstrating persistent, longer-lasting effects of social smells on work rates.
+
+
+### 11.1.1 Comparison of Search Methodologies
+
+| Original Study	| Replication	| Impact of Difference |
+|-----------------|-------------|----------------------|
+| “To perform causal discovery, we used the Fast Greedy Equivalent Search algorithm [25], Tetrad, version 6.8.0.” <br /> <br /> [Section 2.5; page 5 column 2] | We used Tetrad release version 7.1.2-2 instead of 6.8.0. | Details of the results are slightly different throughout due to the use of a later version of the same algorithm and tool, but the general conclusion holds. |
+| “found 14 CVEs (out of 121)… 88% of the CVEs do not engage in any such causal relationships with the socio-technical or work-rate variables. … the 107 (121-14) CVE indicator variables were dropped” <br /> <br /> [Section 2.5; page 7 column 2] | “found 23 CVEs (out of 121)… 80% of the CVEs … the 98 (121-23) CVE indicator variables were dropped” | This result refers to the initial screen. The search algorithm (FGES) is perhaps more sensitive than it was two years ago during the original study, which might contribute to discovering additional CVEs having a somewhat strong causal effect. This increases the workload of later steps but has no obvious impact on the conclusions yet. |
+| Last 4 of the 6 bullets following the lead-in phrase: “To perform a bootstrap as part of causal discovery (further detail found in [27]), a researcher must:” <br /> <br /> [Section 2.5; page 7 column 1] | Replace the last 4 of the 6 bullets with a new bullet that simply states: “Use an appropriate ensemble method to determine which edges to retain in the final graph to be reported as the search result.” | The new statement generalizes the 4 “must” steps specified in the original study into a single step. For the replication, we did not execute the 4 steps described in the study but instead directly employed the Majority Ensemble method, which accomplishes the same intent: an objective and principled way of trimming edges that do not appear often in the bootstrapped search results. The criterion (majority of bootstrap search results vs. 1st-percentile trim probability of no edge) is different in theory but achieves a consistent level of trimming in practice. It is also easier to carry out and thus to replicate. |
+| “Further detail on how the trimming threshold is used and how it can be determined can be found in [27].” <br /> <br /> [Section 2.5; page 7 column 1] | Two years ago, we developed a Python script that uses the edge discovery rate for null variables to establish and apply a 1st-percentile trimming threshold. In the replication, we applied the Majority Ensemble method instead. | If hardly any edges to or from null variables survive after applying the Majority Ensemble method, then we are likely not overfitting the graph to the data. We observed only 2 edges for 41 null variables. Underfitting does not seem likely either, because some edges (2) did survive the trimming that the Majority Ensemble method applies. |
+| “Ensemble Method = Preserved, which tracks every time an edge appears in a bootstrapped search result” <br /> <br /> [Section 3.4; page 12 column 2] | “Ensemble Method = Majority, an edge is assigned to a node pair only if some kind of edge appears in the majority of bootstrap sample search results, in which case, it’s assigned the most-frequent edge type that occurred.” | As explained above, the trimming is built into the Majority Ensemble Method, saving much analyst effort. The only impact on study results is that a few weaker causal edges appear, disappear, or change orientation. Both approaches should identify all strong true positive edges and all strong true negative edges correctly, assuming limited confounding and a large enough sample size. |
+| “First step of the search is required to be symmetric, which tries both orientations for the first edge before committing to one of them” <br /> <br /> [Section 3.4; page 12 column 2] |	This list of search settings omits one: <br /> “Yes, (one edge) faithfulness should be assumed” |	Different releases of Tetrad differ as to which settings are considered default. I don’t know for sure if this setting was used two years ago but this is the setting I used for the replication. |
+| “we excluded the CVE ID indicator variables” <br /> <br /> [Section 3.4; page 12 column 2] | “we excluded the null variables corresponding to CVE ID indicator variables” (i.e., rather than the CVE ID indicators themselves) | This was ambiguously phrased in the original study. The intent was always the one stated in the replication. |
+| “We trimmed the resulting graph up to the 1st percentile (i.e., pairs of variables are ranked by increasing frequency of no edge, and edges are only assigned to those pairs for which no edge appears less often than the 1st-percentile random edge frequency). We were able to set this low threshold because our dataset was large, allowing us to choose conservatively.” <br /> <br /> [Section 3.4; page 12 column 2] | “Because the result of applying the Majority rule leaves so few null variables having any edges (only 2 edges for 42 null variables), the majority rule provides adequate trimming of non-null variable edges as well. Thus, we simply eliminated the null variables and reported the results.” | Explained above. Primary impact is saving the analyst several steps when replicating this study, with no noticeable impact on study results except for some edge type details. |
+| “The results of causal discovery and trimming are indicated in Figure 9.” <br /> <br /> [Section 3.4; page 12 column 2] | Leave out “and trimming” | Explained above. Trimming is no longer a separate step but rather a processing step built into one of the optional settings that Tetrad provides for causal discovery, which we now take advantage of. |
+| “Fourth, we assessed the homogeneity of the dataset and the generalizability of any causal patterns that might be discovered, with the results described earlier: 88% of the CVEs have no discernible idiosyncratic causal role…” <br /> <br /> [Section 3.4; page 12 column 1] | 80% instead of 88% | Explained above. |
+
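The Majority rule quoted in the table above (an edge is kept for a node pair only if some kind of edge appears in the majority of bootstrap search results, and it then takes the most frequent edge type) can be illustrated with a minimal sketch. This is not Tetrad's implementation. The function names, the string edge labels, the use of `None` for "no edge", the strict "more than half" reading of "majority", and the tie-breaking by first occurrence are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # hypothetical representation of an unordered node pair


def majority_edge(edge_per_search: List[Optional[str]]) -> Optional[str]:
    """Apply a Majority-style rule to one node pair.

    edge_per_search holds one entry per bootstrap search: an edge-type
    label such as "-->" or "---", or None when that search found no edge.
    """
    total = len(edge_per_search)
    present = [e for e in edge_per_search if e is not None]
    # Assumption: "majority" means strictly more than half of the searches.
    if total == 0 or 2 * len(present) <= total:
        return None
    # Most frequent edge type; ties go to the type seen first (assumption).
    return Counter(present).most_common(1)[0][0]


def majority_ensemble(results: Dict[Pair, List[Optional[str]]]) -> Dict[Pair, str]:
    """Keep only the node pairs whose edge survives the majority rule."""
    kept: Dict[Pair, str] = {}
    for pair, edges in results.items():
        edge = majority_edge(edges)
        if edge is not None:
            kept[pair] = edge
    return kept


if __name__ == "__main__":
    # Hypothetical variable names and bootstrap outcomes, for illustration only.
    demo = {
        ("social_smell_t", "work_rate_t1"): ["-->", "-->", "---", "-->", None],
        ("null_var_7", "work_rate_t1"): [None, None, "---", None, None],
    }
    print(majority_ensemble(demo))
```

Under this reading, an edge touching a null variable survives only if it shows up in more than half of the bootstrap searches, which is why a small count of surviving null-variable edges is treated above as a sign that the trimming is adequate.
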
+## 11.2 How Results from the Original Study vs. the Replication Differ
+