MaD generator: use `--threads=0` and 2GB per thread for `--ram` by default by redsun82 · Pull Request #19744 · github/codeql

redsun82 · 2025-06-12T14:56:13Z

use --threads=0 and 2GB per thread for --ram by default
also fix a bug where the order of model generation was determined by the order in the download.json file of the experiment rather than the order in the config file
allow configuring --ram and --threads in the MaD generator scripts

The review should ignore ae3bbb0 and 5df292c in order to not look at black formatting changes.

* fix a bug where the order of model generation was determined by the order in the `download.json` file of the experiment rather than the order in the config file
* allow configuring `--ram` and `--threads` in the MaD generator scripts
* use no `--ram` and `--threads=0` by default in the bulk generator (single generator defaults are left unchanged)
* allow passing `--dca` multiple times, with DBs from experiments listed later taking precedence. This makes it possible to run a subset of the sources in a "fixup" experiment and use it to "patch" a previous run without rerunning everything.
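The "fixup" behaviour of repeated `--dca` flags can be sketched roughly as below. This is only an illustration under assumptions, not the code in `bulk_generate_mad.py`: the `merge_experiments` helper, the experiment names, the project names, and the shape of the data (a mapping from source name to database path) are all hypothetical.

```python
import argparse


def merge_experiments(experiments):
    """Merge per-experiment {source: database} mappings.

    Experiments listed later take precedence, so a small "fixup" run
    can patch individual sources of an earlier full run.
    (Hypothetical helper; the data shape is assumed for illustration.)
    """
    merged = {}
    for experiment in experiments:
        merged.update(experiment)  # later entries overwrite earlier ones
    return merged


parser = argparse.ArgumentParser()
# action="append" collects every occurrence of --dca into a list.
parser.add_argument("--dca", action="append", default=[])
args = parser.parse_args(["--dca", "full-run", "--dca", "fixup-run"])

# Stand-in for the databases each experiment would provide (made-up names).
available = {
    "full-run": {"project-a": "full/a.zip", "project-b": "full/b.zip"},
    "fixup-run": {"project-a": "fixup/a.zip"},
}
databases = merge_experiments(available[name] for name in args.dca)
```

Here `project-a` ends up pointing at the fixup database while `project-b` is still taken from the full run.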

The standalone MaD generator now uses `0` for threads and throttles the RAM to use 2GB per thread by default. Also, replaced the hand-written argument parsing with `argparse`.
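As a rough sketch of the default described above (and not the actual logic in `generate_mad.py`), the RAM limit could be derived from the thread count like this. The `resolve_resources` name and its parameters are hypothetical; the sketch assumes that `--threads=0` means one thread per available core and that `--ram` is expressed in MB, and it does not handle negative thread counts.

```python
import os


def resolve_resources(threads=0, ram=None, ram_per_thread_mb=2048):
    """Return the (threads, ram_mb) pair to pass on to the CLI.

    Assumptions for this sketch: `threads=0` is taken to mean
    "one thread per available core", and an unset `ram` falls back to
    2 GB (2048 MB) per thread actually used. Negative thread counts
    are not handled here.
    """
    effective_threads = threads if threads > 0 else (os.cpu_count() or 1)
    if ram is None:
        ram = ram_per_thread_mb * effective_threads
    # The thread value is passed through unchanged; only RAM is derived.
    return threads, ram
```

An explicit `ram` value always wins, so callers who pass `--ram` themselves are not throttled by the per-thread default.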

Copilot

Pull Request Overview

This PR enhances the MaD generator scripts by adding configurable resource flags, fixing model-generation ordering, and improving DCA experiment support.

Replace manual argument parsing in generate_mad.py with argparse, introducing --threads and --ram (default 0 threads and 2 GB per thread).
Update bulk_generate_mad.py to propagate these new flags into the generator, support multiple --dca runs, and remove the stale git-status precheck.
Fix bug where model-generation order followed download.json rather than the user’s config.
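The ordering fix amounts to iterating in config order and looking up the downloaded databases by name, rather than iterating over the download manifest. A minimal sketch, assuming a hypothetical `order_by_config` helper and simplified data shapes (neither is taken from the PR):

```python
def order_by_config(config_names, downloaded):
    """Return downloaded entries in the order given by the config.

    `config_names` is the list of project names as written in the config;
    `downloaded` is a list of (name, database_path) pairs in whatever
    order the download manifest happened to list them.
    Names missing from the download are skipped.
    """
    by_name = dict(downloaded)
    return [(name, by_name[name]) for name in config_names if name in by_name]
```

Driving the loop from the config makes the generation order independent of how `download.json` happens to be sorted.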

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| misc/scripts/models-as-data/generate_mad.py | Switched to `argparse`, added `threads`/`ram` handling, and updated default RAM |
| misc/scripts/models-as-data/bulk_generate_mad.py | Changed `generate_models` signature to accept CLI args, wired new flags, updated DCA download loop |

Comments suppressed due to low confidence (2)

misc/scripts/models-as-data/bulk_generate_mad.py:228

Update this function’s docstring to document the new args parameter (its type, purpose, and which CLI flags it carries).

def generate_models(config, args, project: Project, database_dir: str) -> None:

misc/scripts/models-as-data/generate_mad.py:143

Add tests (unit or integration) to verify the behavior of the new CLI flags (--threads, --ram, --with-*) and defaulting logic in generate_mad.py.

generator = p.parse_args(namespace=Generator())
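For readers unfamiliar with the `namespace=` argument used in the line above: `argparse` stores the parsed values as attributes on the object it is given and returns that same object. A small sketch with a hypothetical stand-in `Generator` class (the real class in `generate_mad.py` is not reproduced here, and the two options shown are only assumed to resemble its flags):

```python
import argparse


class Generator:
    """Hypothetical stand-in for the script's Generator class."""

    def describe(self):
        # Attributes are filled in by argparse via setattr().
        return f"threads={self.threads} ram={self.ram}"


p = argparse.ArgumentParser()
p.add_argument("--threads", type=int, default=0)
p.add_argument("--ram", type=int, default=None)

# parse_args() populates and returns the Generator instance itself.
generator = p.parse_args(["--threads", "4"], namespace=Generator())
```

A test along the lines suggested in the review comment could call `p.parse_args([...], namespace=Generator())` with different argument lists and assert on the resulting attributes.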

misc/scripts/models-as-data/generate_mad.py

Models are regenerated with the fix from #19744 which corrects the order of generation.

redsun82 · 2025-06-19T09:04:10Z

aw, the apache/dubbo problem has struck again, which means this doesn't fully solve it as I thought it would (though it does seem to make it rarer)

redsun82 added 3 commits June 12, 2025 16:23

MaD generator: change default thread and ram

39a3623

The standalone MaD generator now uses `0` for threads and throttles the RAM to use 2GB per thread by default. Also, replaced the hand-written argument parsing with `argparse`.

MaD generator: run black formatter

ae3bbb0

Copilot AI review requested due to automatic review settings June 12, 2025 14:56

Copilot AI reviewed Jun 12, 2025

redsun82 mentioned this pull request Jun 12, 2025

Rust: Use QL computed canonical paths in MaD Field tokens #19667

Merged

redsun82 added 3 commits June 13, 2025 08:42

Merge branch 'main' into redsun82/mad-generator

f7266c9

MaD generator: really fix ordering problem

1a36374

MaD generator: apply black formatting to all sources

5df292c

redsun82 added a commit that referenced this pull request Jun 13, 2025

Rust: regenerate models

118456d

Models are regenerated with the fix from #19744 which corrects the order of generation.

This was referenced Jun 13, 2025

Rust: regenerate models #19748

Merged

CI: fix python version #19765

Merged

redsun82 requested review from MathiasVP and paldepind June 16, 2025 12:13

Merge branch 'main' into redsun82/mad-generator

24cfc84

redsun82 mentioned this pull request Jun 19, 2025

Rust: adapt model generation to new format #19819

Merged

redsun82 merged commit 24cfc84 into main Jun 23, 2025
23 of 30 checks passed

redsun82 deleted the redsun82/mad-generator branch June 23, 2025 07:16

redsun82 merged 7 commits into main from redsun82/mad-generator
