@shbhmexe
Contributor

Summary

This PR fixes the scripts used to run gitdm analytics so they correctly locate the checked-in engine (src/cncfdm.py) and config base directory, and fixes the PowerShell wrapper so it works on Windows.

Root causes

  • src/rerun_data.sh referenced a non-existent ./all.sh.
  • Multiple runner scripts hard-coded ~/dev/cncf/gitdm/cncfdm.py and -b ~/dev/cncf/gitdm/, but the repo layout uses src/cncfdm.py and expects -b .../src/.
  • gitdm.ps1 used bash heredoc syntax and did not correctly invoke py -2 or forward piped input.

What changed (high level)

  • gitdm.ps1: reliable Python 2 detection/invocation and pipeline stdin passthrough.
  • src/rerun_data.sh: call ./run_all.sh instead of missing ./all.sh.
  • src/run_all.sh, src/all_no_map.sh, src/all_with_map.sh: use GITDM_HOME (script-relative) rather than hard-coded ~/dev/... paths.
  • src/run_no_map.sh, src/run_with_map.sh, src/run_for_rels_no_map.sh, src/run_for_rels_strict.sh, src/anyrepo.sh, src/anyreporange.sh, src/multirepo.sh, src/commits_in_ranges.sh: derive GITDM_HOME and use "$GITDM_HOME/src/cncfdm.py" with -b "$GITDM_HOME/src/".
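The script-relative lookup described above can be sketched roughly as follows. This is an illustration, not the code from the PR: `resolve_gitdm_home`, `ENGINE`, and `BASE_DIR` are hypothetical names, and the sketch assumes the calling script sits in `src/` directly under the repository root.

```shell
#!/bin/sh
# Sketch only: resolve the repo root from a script's own location instead of
# assuming a checkout at ~/dev/cncf/gitdm/.

# resolve_gitdm_home is a hypothetical helper, not part of the repository.
# $1 is the path of a script assumed to live in <repo>/src/.
# The function body runs in a subshell, so the cd calls do not leak out.
resolve_gitdm_home() (
  script_dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd) || exit 1
  CDPATH= cd -- "$script_dir/.." && pwd
)

# Derive the repo root from the location of the running script.
GITDM_HOME=$(resolve_gitdm_home "$0")

# Paths the runner scripts would hand to the engine (names are illustrative).
ENGINE="$GITDM_HOME/src/cncfdm.py"
BASE_DIR="$GITDM_HOME/src/"

echo "engine: $ENGINE"
echo "base:   $BASE_DIR"
```

Because the root is computed from the script path, the same runner should work from any checkout location and from any current working directory.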

Impact

  • Restores the documented regeneration workflow (src/rerun_data.sh) by fixing missing/broken script targets and engine paths.
  • Improves reliability for users who don’t have the repo checked out exactly at ~/dev/cncf/gitdm/.
  • Does not change the analysis logic or output formats, only how scripts locate and invoke the existing engine.

Validation

  • Verified all modified scripts now reference the repo-local engine path and no longer require a ~/dev/cncf/gitdm/cncfdm.py file.
  • I could not run an end-to-end cncfdm execution on my machine because no Python 2/PyPy2 interpreter is installed. The changes are isolated to wrapper/runner scripts and preserve existing flags and behavior.
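A static check like the first validation step could look something like the sketch below. It is an assumption about how such a check might be written, not the procedure actually used for this PR; `find_hardcoded_paths` and `check_scripts` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: find runner scripts that still mention the old hard-coded
# checkout location. Both helpers are hypothetical, not part of the repo.

# $1: directory to scan. Prints the path of every *.sh file that contains
# the literal string ~/dev/cncf/gitdm (matched with grep -F, no regex).
find_hardcoded_paths() {
  find "$1" -type f -name '*.sh' -exec grep -lF -e '~/dev/cncf/gitdm' {} +
}

# $1: directory to scan. Returns 0 when no script references the old path,
# 1 when at least one does, 2 when the directory does not exist.
check_scripts() {
  if [ ! -d "$1" ]; then
    echo "not a directory: $1" >&2
    return 2
  fi
  hits=$(find_hardcoded_paths "$1")
  if [ -n "$hits" ]; then
    printf 'still hard-coded:\n%s\n' "$hits"
    return 1
  fi
  echo "no hard-coded engine paths found"
}
```

Calling `check_scripts src` from the repository root would scan the runner scripts. A match inside a comment still counts, so any hits need a quick manual look.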

Confirmation

These changes are focused on correctness of script execution paths and wrapper behavior; they do not change the underlying gitdm/cncfdm analysis behavior.

Several runner scripts referenced ~/dev/cncf/gitdm/cncfdm.py, but the engine
lives under src/cncfdm.py. This breaks common workflows driven by rerun_data.sh
(run_no_map/run_with_map, strict/no-map release runs, multirepo runs, and
commits-in-range analysis) unless the user has a custom copy/symlink.

Also fix gitdm.ps1 on Windows PowerShell:
- remove bash heredoc usage
- correctly invoke "py -2" with arguments
- forward piped stdin to cncfdm.py when used in a pipeline

Signed-off-by: shbhmexe <shubhushukla586@gmail.com>