fix: profile and localizing #7
Caution: Review failed. The pull request is closed.
Walkthrough
This PR restructures a Snakemake-based drug screening workflow, replacing configuration-driven output directories with hardcoded "results" paths, refactoring ZINC download logic from HTTP requests to curl-based chunking with local assembly, introducing new docking preparation rules with spacing calculations, and adding comprehensive logging across preparation and analysis rules.
Sequence Diagram(s)
sequenceDiagram
participant Config
participant Snakefile
participant Rules
participant Scripts
Note over Config,Scripts: OLD FLOW: Config-driven Paths
Config->>Snakefile: OUTPUT_DIR, INPUT_DIR, TMP_DIR
Snakefile->>Rules: Global vars (INPUT_DIR, OUTPUT_DIR, etc.)
Rules->>Scripts: Path references via config
Scripts->>Scripts: Write to TMP_DIR, INPUT_DIR, OUTPUT_DIR
Note over Config,Scripts: NEW FLOW: Hardcoded Relative Paths
Config->>Snakefile: EXPERIMENT_NAME, ZINC_MIRROR
Snakefile->>Rules: Hardcoded "results", "scratch", "docking"
Rules->>Rules: prepare_docking_local/ligand (new)
Rules->>Scripts: Direct path strings
Scripts->>Scripts: Write to "scratch", "prepared", "minimized", "grid"
sequenceDiagram
participant Config
participant ZINCdownload
participant ZINC_Server
participant Local_Storage
Note over Config,Local_Storage: OLD: Single HTTP Request
Config->>ZINCdownload: dataset, name
ZINCdownload->>ZINC_Server: requests.get(URL/file.pdbqt.gz)
ZINC_Server-->>ZINCdownload: Single large file
ZINCdownload->>Local_Storage: Save to INPUT_DIR/ZINC/...
Note over Config,Local_Storage: NEW: Chunked Download with Assembly
Config->>ZINCdownload: WEIGHT, LOGP, REACT, PURCHASE, PH, CHARGE
ZINCdownload->>ZINCdownload: Generate tranche/subset combinations
loop For Each Chunk
ZINCdownload->>ZINC_Server: curl (with mirror support)
ZINC_Server-->>ZINCdownload: Chunk file
ZINCdownload->>Local_Storage: Store chunk + SHA-256 (hashes.txt)
end
ZINCdownload->>ZINCdownload: Decompress + concatenate chunks
ZINCdownload->>Local_Storage: Final gzip assembly + checksum
ZINCdownload->>Local_Storage: Cleanup chunk files
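The new download flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function names (`tranche_combinations`, `sha256_of`, `assemble_chunks`) are hypothetical, the config values are assumed to be lists of single-letter codes that are joined into one tranche/subset string, and the chunks are assumed to be already on disk as gzip files (the curl step is left out).

```python
import gzip
import hashlib
import itertools
import shutil
from pathlib import Path


def tranche_combinations(weight, logp, react, purchase, ph, charge):
    """Return every combination of the given codes as one joined string.

    Assumption: each argument is a list of single-letter codes; the real
    workflow may encode tranches and subsets differently.
    """
    product = itertools.product(weight, logp, react, purchase, ph, charge)
    return ["".join(combo) for combo in product]


def sha256_of(path):
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def assemble_chunks(chunk_paths, final_path, hash_file):
    """Hash each chunk, merge all chunks into one gzip file, then clean up.

    Returns the SHA-256 digest of the assembled file.
    """
    chunk_paths = [Path(p) for p in chunk_paths]
    final_path = Path(final_path)

    # Record one "digest  filename" line per downloaded chunk.
    with open(hash_file, "w") as hashes:
        for chunk in chunk_paths:
            hashes.write(f"{sha256_of(chunk)}  {chunk.name}\n")

    # Decompress every chunk and stream it into a single gzip archive.
    with gzip.open(final_path, "wb") as merged:
        for chunk in chunk_paths:
            with gzip.open(chunk, "rb") as source:
                shutil.copyfileobj(source, merged)

    # Append the checksum of the final archive to the same hash file.
    checksum = sha256_of(final_path)
    with open(hash_file, "a") as hashes:
        hashes.write(f"{checksum}  {final_path.name}\n")

    # Remove the chunk files once the assembled archive exists.
    for chunk in chunk_paths:
        chunk.unlink()
    return checksum
```

Recompressing into a single gzip stream, rather than concatenating the compressed chunks byte for byte, is one possible reading of "Decompress + concatenate chunks"; the PR may do this differently.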
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Porting the workflow to a new cluster. It turned out that many design decisions were faulty. In particular, the need for so many absolute paths is gone.
Note: the porting and fixing are only half done. The remaining changes will be more fine-grained.