254 changes: 128 additions & 126 deletions constraints-dev.txt
@@ -1,114 +1,118 @@
absl-py==2.3.1 # via tensorboard
accelerate==1.10.1 # via peft, trl, -r requirements-cuda.txt, -r requirements-rocm.txt
aiofiles==24.1.0 # via -r requirements.txt
accelerate==1.12.0 # via peft, trl, -r requirements-cuda.txt, -r requirements-rocm.txt
aiofiles==25.1.0 # via -r requirements.txt
aiohappyeyeballs==2.6.1 # via aiohttp
aiohttp==3.12.15 # via fsspec
aiohttp==3.13.3 # via fsspec
aiosignal==1.4.0 # via aiohttp
annotated-types==0.7.0 # via pydantic
anyio==4.10.0 # via httpx, jupyter-server
anyio==4.12.1 # via httpx, jupyter-server
argon2-cffi==25.1.0 # via jupyter-server
argon2-cffi-bindings==25.1.0 # via argon2-cffi
arrow==1.3.0 # via isoduration
astroid==3.3.11 # via pylint
asttokens==3.0.0 # via stack-data
async-lru==2.0.5 # via jupyterlab
attrs==25.3.0 # via aiohttp, jsonschema, referencing
arrow==1.4.0 # via isoduration
astroid==4.0.3 # via pylint
asttokens==3.0.1 # via stack-data
async-lru==2.1.0 # via jupyterlab
attrs==25.4.0 # via aiohttp, jsonschema, referencing
babel==2.17.0 # via jupyterlab-server
beautifulsoup4==4.13.5 # via nbconvert
bitsandbytes==0.47.0 # via -r requirements-cuda.txt
bleach==6.2.0 # via nbconvert
cachetools==6.2.0 # via tox
certifi==2025.8.3 # via httpcore, httpx, requests, sentry-sdk
cffi==1.17.1 # via argon2-cffi-bindings
cfgv==3.4.0 # via pre-commit
beautifulsoup4==4.14.3 # via nbconvert
bitsandbytes==0.49.1 # via -r requirements-cuda.txt
bleach==6.3.0 # via nbconvert
cachetools==6.2.4 # via tox
causal-conv1d==1.6.0 # via mamba-ssm
certifi==2026.1.4 # via httpcore, httpx, requests, sentry-sdk
cffi==2.0.0 # via argon2-cffi-bindings
cfgv==3.5.0 # via pre-commit
chardet==5.2.0 # via tox
charset-normalizer==3.4.3 # via requests
click==8.2.1 # via wandb
charset-normalizer==3.4.4 # via requests
click==8.3.1 # via wandb
colorama==0.4.6 # via tox
comm==0.2.3 # via ipykernel, ipywidgets
contourpy==1.3.3 # via matplotlib
cycler==0.12.1 # via matplotlib
datasets==4.0.0 # via trl, -r requirements.txt
debugpy==1.8.16 # via ipykernel
datasets==4.5.0 # via trl, -r requirements.txt
debugpy==1.8.19 # via ipykernel
decorator==5.2.1 # via ipython
deepspeed==0.17.5 # via -r requirements-deepspeed.txt
deepspeed==0.18.4 # via -r requirements-deepspeed.txt
defusedxml==0.7.1 # via nbconvert
dill==0.3.8 # via datasets, multiprocess, pylint
dill==0.4.0 # via datasets, multiprocess, pylint
distlib==0.4.0 # via virtualenv
einops==0.8.1 # via deepspeed, flash-attn
executing==2.2.0 # via stack-data
einops==0.8.1 # via deepspeed, flash-attn, mamba-ssm
executing==2.2.1 # via stack-data
fastjsonschema==2.21.2 # via nbformat
filelock==3.19.1 # via datasets, huggingface-hub, torch, tox, transformers, virtualenv
filelock==3.20.3 # via datasets, huggingface-hub, torch, tox, transformers, virtualenv
flash-attn==2.7.4.post1 # via -c constraints-dev.txt.in, -r requirements-cuda.txt, -r requirements-rocm.txt
fonttools==4.59.2 # via matplotlib
fonttools==4.61.1 # via matplotlib
fqdn==1.5.1 # via jsonschema
frozenlist==1.7.0 # via aiohttp, aiosignal
fsspec==2025.3.0 # via datasets, huggingface-hub, torch
frozenlist==1.8.0 # via aiohttp, aiosignal
fsspec==2025.10.0 # via datasets, huggingface-hub, torch
gitdb==4.0.12 # via gitpython
gitpython==3.1.45 # via wandb
grpcio==1.74.0 # via tensorboard
gitpython==3.1.46 # via wandb
grpcio==1.76.0 # via tensorboard
h11==0.16.0 # via httpcore
hf-xet==1.1.9 # via huggingface-hub
hf-xet==1.2.0 # via huggingface-hub
hjson==3.1.0 # via deepspeed
httpcore==1.0.9 # via httpx
httpx==0.28.1 # via jupyterlab
huggingface-hub==0.34.4 # via accelerate, datasets, peft, tokenizers, transformers, -r requirements-dev.txt
identify==2.6.13 # via pre-commit
idna==3.10 # via anyio, httpx, jsonschema, requests, yarl
iniconfig==2.1.0 # via pytest
ipykernel==6.30.1 # via jupyter, jupyter-console, jupyterlab, -r requirements-dev.txt
ipython==9.5.0 # via ipykernel, ipywidgets, jupyter-console, -r requirements-dev.txt
httpx==0.28.1 # via datasets, jupyterlab
huggingface-hub==0.36.0 # via accelerate, datasets, kernels, peft, tokenizers, transformers, -r requirements-dev.txt
identify==2.6.16 # via pre-commit
idna==3.11 # via anyio, httpx, jsonschema, requests, yarl
iniconfig==2.3.0 # via pytest
ipykernel==7.1.0 # via jupyter, jupyter-console, jupyterlab, -r requirements-dev.txt
ipython==9.9.0 # via ipykernel, ipywidgets, jupyter-console, -r requirements-dev.txt
ipython-pygments-lexers==1.1.1 # via ipython
ipywidgets==8.1.7 # via jupyter
ipywidgets==8.1.8 # via jupyter
isoduration==20.11.0 # via jsonschema
isort==6.0.1 # via pylint, -r requirements-dev.txt
isort==7.0.0 # via pylint, -r requirements-dev.txt
jedi==0.19.2 # via ipython
jinja2==3.1.6 # via jupyter-server, jupyterlab, jupyterlab-server, nbconvert, torch
json5==0.12.1 # via jupyterlab-server
json5==0.13.0 # via jupyterlab-server
jsonpointer==3.0.0 # via jsonschema
jsonschema==4.25.1 # via jupyter-events, jupyterlab-server, nbformat
jsonschema-specifications==2025.4.1 # via jsonschema
jsonschema==4.26.0 # via jupyter-events, jupyterlab-server, nbformat
jsonschema-specifications==2025.9.1 # via jsonschema
jupyter==1.1.1 # via -r requirements-dev.txt
jupyter-client==8.6.3 # via ipykernel, jupyter-console, jupyter-server, nbclient
jupyter-client==8.8.0 # via ipykernel, jupyter-console, jupyter-server, nbclient
jupyter-console==6.6.3 # via jupyter
jupyter-core==5.8.1 # via ipykernel, jupyter-client, jupyter-console, jupyter-server, jupyterlab, nbclient, nbconvert, nbformat
jupyter-core==5.9.1 # via ipykernel, jupyter-client, jupyter-console, jupyter-server, jupyterlab, nbclient, nbconvert, nbformat
jupyter-events==0.12.0 # via jupyter-server
jupyter-lsp==2.3.0 # via jupyterlab
jupyter-server==2.17.0 # via jupyter-lsp, jupyterlab, jupyterlab-server, notebook, notebook-shim
jupyter-server-terminals==0.5.3 # via jupyter-server
jupyterlab==4.4.6 # via jupyter, notebook
jupyter-server-terminals==0.5.4 # via jupyter-server
jupyterlab==4.5.2 # via jupyter, notebook
jupyterlab-pygments==0.3.0 # via nbconvert
jupyterlab-server==2.27.3 # via jupyterlab, notebook
jupyterlab-widgets==3.0.15 # via ipywidgets
jupyterlab-server==2.28.0 # via jupyterlab, notebook
jupyterlab-widgets==3.0.16 # via ipywidgets
kernels==0.11.7 # via -r requirements-cuda.txt
kiwisolver==1.4.9 # via matplotlib
lark==1.2.2 # via rfc3987-syntax
liger-kernel==0.6.2 # via -r requirements-cuda.txt
llvmlite==0.45.0 # via numba
markdown==3.8.2 # via tensorboard
lark==1.3.1 # via rfc3987-syntax
librt==0.7.8 # via mypy
liger-kernel==0.6.4 # via -r requirements-cuda.txt
llvmlite==0.46.0 # via numba
mamba-ssm==2.3.0 # via -r requirements-cuda.txt, -r requirements-rocm.txt
markdown==3.10 # via tensorboard
markdown-it-py==4.0.0 # via rich
markupsafe==3.0.2 # via jinja2, nbconvert, werkzeug
matplotlib==3.10.6 # via -r requirements-dev.txt
matplotlib-inline==0.1.7 # via ipykernel, ipython
markupsafe==3.0.3 # via jinja2, nbconvert, werkzeug
matplotlib==3.10.8 # via -r requirements-dev.txt
matplotlib-inline==0.2.1 # via ipykernel, ipython
mccabe==0.7.0 # via pylint
mdurl==0.1.2 # via markdown-it-py
mistune==3.1.4 # via nbconvert
mistune==3.2.0 # via nbconvert
mpmath==1.3.0 # via sympy
msgpack==1.1.1 # via deepspeed
multidict==6.6.4 # via aiohttp, yarl
multiprocess==0.70.16 # via datasets
mypy==1.17.1 # via -r requirements-dev.txt
msgpack==1.1.2 # via deepspeed
multidict==6.7.0 # via aiohttp, yarl
multiprocess==0.70.18 # via datasets
mypy==1.19.1 # via -r requirements-dev.txt
mypy-extensions==1.1.0 # via mypy
nbclient==0.10.2 # via nbconvert
nbclient==0.10.4 # via nbconvert
nbconvert==7.16.6 # via jupyter, jupyter-server
nbformat==5.10.4 # via jupyter-server, nbclient, nbconvert
nest-asyncio==1.6.0 # via ipykernel
networkx==3.5 # via torch
ninja==1.13.0 # via deepspeed
nodeenv==1.9.1 # via pre-commit
notebook==7.4.5 # via jupyter
networkx==3.6.1 # via torch
ninja==1.13.0 # via causal-conv1d, deepspeed, mamba-ssm
nodeenv==1.10.0 # via pre-commit
notebook==7.5.2 # via jupyter
notebook-shim==0.2.4 # via jupyterlab, notebook
numba==0.62.1 # via -r requirements.txt
numpy==2.3.3 # via accelerate, bitsandbytes, contourpy, datasets, deepspeed, matplotlib, numba, pandas, peft, tensorboard, transformers, -r requirements-dev.txt, -r requirements.txt
numba==0.63.1 # via -r requirements.txt
numpy==2.3.5 # via accelerate, bitsandbytes, contourpy, datasets, deepspeed, matplotlib, numba, pandas, peft, tensorboard, transformers, -r requirements-dev.txt, -r requirements.txt
nvidia-cublas-cu12==12.4.5.8 # via nvidia-cudnn-cu12, nvidia-cusolver-cu12, torch
nvidia-cuda-cupti-cu12==12.4.127 # via torch
nvidia-cuda-nvrtc-cu12==12.4.127 # via torch
@@ -123,92 +127,90 @@
nvidia-nccl-cu12==2.21.5 # via torch
nvidia-nvjitlink-cu12==12.4.127 # via nvidia-cusolver-cu12, nvidia-cusparse-cu12, torch
nvidia-nvtx-cu12==12.4.127 # via torch
overrides==7.7.0 # via jupyter-server
packaging==25.0 # via accelerate, datasets, deepspeed, huggingface-hub, ipykernel, jupyter-events, jupyter-server, jupyterlab, jupyterlab-server, matplotlib, nbconvert, peft, pyproject-api, pytest, tensorboard, tox, transformers, wandb, -r requirements.txt
pandas==2.3.2 # via datasets
packaging==25.0 # via accelerate, bitsandbytes, causal-conv1d, datasets, deepspeed, huggingface-hub, ipykernel, jupyter-events, jupyter-server, jupyterlab, jupyterlab-server, kernels, mamba-ssm, matplotlib, nbconvert, peft, pyproject-api, pytest, tensorboard, tox, transformers, trl, wandb, -r requirements.txt
pandas==2.3.3 # via datasets
pandocfilters==1.5.1 # via nbconvert
parso==0.8.5 # via jedi
pathspec==0.12.1 # via mypy
peft==0.17.1 # via -r requirements.txt
pathspec==1.0.3 # via mypy
peft==0.18.1 # via -r requirements.txt
pexpect==4.9.0 # via ipython
pillow==11.3.0 # via matplotlib, tensorboard
platformdirs==4.4.0 # via jupyter-core, pylint, tox, virtualenv, wandb
pillow==12.1.0 # via matplotlib, tensorboard
platformdirs==4.5.1 # via jupyter-core, pylint, tox, virtualenv, wandb
pluggy==1.6.0 # via pytest, tox
pre-commit==4.3.0 # via -r requirements-dev.txt
prometheus-client==0.22.1 # via jupyter-server
pre-commit==4.5.1 # via -r requirements-dev.txt
prometheus-client==0.24.1 # via jupyter-server
prompt-toolkit==3.0.52 # via ipython, jupyter-console
propcache==0.3.2 # via aiohttp, yarl
protobuf==6.32.0 # via tensorboard, wandb
psutil==7.0.0 # via accelerate, deepspeed, ipykernel, peft
propcache==0.4.1 # via aiohttp, yarl
protobuf==6.33.4 # via tensorboard, wandb
psutil==7.2.1 # via accelerate, deepspeed, ipykernel, peft
ptyprocess==0.7.0 # via pexpect, terminado
pure-eval==0.2.3 # via stack-data
py-cpuinfo==9.0.0 # via deepspeed, -r requirements.txt
pyarrow==21.0.0 # via datasets
pycparser==2.22 # via cffi
pydantic==2.11.7 # via deepspeed, pylint-pydantic, wandb, -r requirements.txt
pydantic-core==2.33.2 # via pydantic
pyarrow==23.0.0 # via datasets
pycparser==2.23 # via cffi
pydantic==2.12.5 # via deepspeed, pylint-pydantic, wandb, -r requirements.txt
pydantic-core==2.41.5 # via pydantic
pygments==2.19.2 # via ipython, ipython-pygments-lexers, jupyter-console, nbconvert, pytest, rich
Comment on lines +151 to 153

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Pydantic 2.12.x breaking changes migration guide

💡 Result:

Below are the important breaking changes introduced in Pydantic 2.12.x and a short, actionable migration checklist.

Key breaking changes (what changed and impact)

  • JSON Schema: dict-valued json_schema_extra is now merged rather than replaced — adjust code that relied on previous replace semantics when generating/modifying JSON Schema. [1]
  • Annotation handling/refactor: annotation injection for known/generic types was refactored and some annotation compatibility errors are moved to validation-time (not import-time) — tests that previously passed may now raise validation errors at runtime. Audit generics, Annotated metadata, and any custom Generic model usage. [1]
  • serialize_as_any / serialization behavior: serialize-as-any behavior changed (affects custom serializers, computed fields, and RootModel serialization); existing serializers may need updates. New flags like exclude_computed_fields and serialize options were added. Run serialization tests and review custom serializer implementations. [1][2]
  • Removal/changes in internal metadata: typed_dict_cls removed from CoreMetadata; don’t rely on that internal metadata. [1]
  • JSON Schema 'allOf' workarounds removed — code expecting the old workarounds in generated JSON Schema should be updated. [1]
  • Examples API: passing a dict to Examples is deprecated — update any code constructing Examples from dicts. [1]
  • Validator/validator-signature behavior: some changes to validator inspection and error messages (including fixes around evaluating annotations) — review model validators and after-model validators (classmethod after-model validators previously erroring were later changed to emit warnings in patch releases). [1][2]
  • Python 3.14 compatibility notes / MISSING sentinel: new MISSING sentinel and initial Python 3.14 support; be aware of MISSING semantics and Python-version-specific edge cases. [1][2]

Migration checklist (practical steps)

  1. Read the 2.12 changelog and release notes (linked below) to see the full list of changes and follow-up patch fixes. [1][2].
  2. Pin pydantic-core to the compatible version used by 2.12 (use the release notes / pip metadata) and run your test suite under the target Python versions you support (especially 3.10–3.14). [2]
  3. Run the whole test suite with strict failure (serialization, validation, JSON schema, and any runtime-generate-schema code). Fix failures in:
    • custom validators/serializers (update signatures or logic per new inspection rules). [1][2]
    • serialization paths relying on serialize_as_any, RootModel, or computed fields (consider new exclude_computed_fields option). [1][2]
    • JSON Schema generation code (handle merged json_schema_extra; remove assumptions about 'allOf' workarounds). [1]
  4. Replace any reliance on internal CoreMetadata fields (typed_dict_cls) and stop passing dicts to Examples. [1]
  5. If you used Annotated metadata mutability patterns (re-using FieldInfo objects), review and replace with supported patterns (docs/examples updated). [2]
  6. If you saw after-model validator classmethod errors in 2.12.0, update to accept the new behavior or apply the patched warning semantics in later 2.12.x patch releases; consult the 2.12.x changelog for which patch fixed/regressed which behaviour. [2]
  7. When upgrading in production: roll out in a canary environment, pin to an explicit 2.12.x patch (not just 2.12), and iterate fixes before wide rollout.

Important links (read first)

  • Official changelog (detailed breaking-change list + patch notes). [1]
  • GitHub release notes for v2.12 (notes on specific fixes and follow-up patch releases). [2]
  • v2.12 announcement / upgrade blog post (high-level highlights and migration pointers). [3]
  • Version-compatibility policy (what the project considers breaking vs. allowed changes). [4]

References
[1] Pydantic changelog (v2.12 section).
[2] Pydantic GitHub releases (v2.12.0 → v2.12.x).
[3] Pydantic v2.12 announcement blog post.
[4] Pydantic version compatibility docs.

Would you like a targeted migration patch plan for your codebase? If so, paste a few representative models/serializers/validator snippets and I’ll produce exact changes.


🏁 Script executed:

# Check constraints-dev.txt for full context
head -n 160 constraints-dev.txt | tail -n 30

Repository: instructlab/training

Length of output: 1701


🏁 Script executed:

# Search for Pydantic imports and usage in codebase
rg -i "from pydantic|import pydantic" --type py -A 2 | head -50

Repository: instructlab/training

Length of output: 360


🏁 Script executed:

# Check for Pydantic model definitions in codebase
rg "BaseModel|ConfigDict|field_validator" --type py | head -30

Repository: instructlab/training

Length of output: 1015


🌐 Web query:

deepspeed pydantic 2.12 compatibility

💡 Result:

Short answer: DeepSpeed has migrated to Pydantic v2 (PRs/merges for v2 support exist), so it is generally compatible with Pydantic v2 releases — but you must use a matching pydantic_core version (Pydantic v2.12 enforces an exact pydantic-core pairing). If you see errors, upgrade DeepSpeed to the latest release/commit and ensure pydantic and pydantic-core versions match. (app.semanticdiff.com)

What to do if you hit problems

  • Upgrade DeepSpeed (pip install --upgrade deepspeed or use the repo's latest master/commit where Pydantic v2 migration was merged). (pypi.org)
  • Install matching pydantic_core for your pydantic version (Pydantic v2.12 will error if pydantic_core is incompatible). (pydantic.dev)
  • If errors persist, check DeepSpeed GitHub issues for similar reports (there are older pydantic-related issues) and file an issue with your ds_report output. (github.com)

If you want, tell me the DeepSpeed version and your installed pydantic / pydantic-core versions and I’ll verify compatibility and give exact pip commands.


🌐 Web query:

wandb pydantic 2.12 compatibility requirements

💡 Result:

Short answer: Recent wandb SDK releases declare pydantic v2 compatibility (wandb added pydantic>=2.6,<3 in v0.19.0 and later relaxed support for v1/v2), so pydantic 2.12 is within the allowed range, but there are user-reported warnings/bugs when importing wandb with pydantic 2.12 that may require a patch or a temporary pin to an earlier 2.x release. [1][2][3]

Details / action:

  • wandb declared pydantic>=2.6,<3 as a dependency (v0.19.0+). [1]
  • Pydantic 2.12 was released upstream. [3]
  • There is a reported wandb GitHub issue describing warnings with pydantic 2.12+ (UnsupportedFieldAttributeWarning); consider pinning to a known-working 2.x (e.g., 2.11.x) until a wandb fix/release resolves it. [2]

Sources:
[1] wandb SDK release notes (v0.19.0 / changelog).
[2] wandb GitHub issue reporting UnsupportedFieldAttributeWarning with Pydantic 2.12+.
[3] Pydantic v2.12 release announcement.


🏁 Script executed:

# Read the config.py file to check for usage patterns affected by breaking changes
wc -l src/instructlab/training/config.py

Repository: instructlab/training

Length of output: 102


🏁 Script executed:

# Check the actual config.py implementation
cat -n src/instructlab/training/config.py | head -100

Repository: instructlab/training

Length of output: 3328


🏁 Script executed:

# Continue reading config.py to see model implementations and validators
cat -n src/instructlab/training/config.py | tail -n +100 | head -150

Repository: instructlab/training

Length of output: 5630


🏁 Script executed:

# Check if there are any custom serializers, computed fields, or RootModel usage
rg -i "computed_field|root_model|field_serializer|model_serializer|serialize_as_any" --type py

Repository: instructlab/training

Length of output: 46


🏁 Script executed:

# Check for any JSON schema generation or Examples usage in config
rg "json_schema|Examples|model_json_schema" --type py

Repository: instructlab/training

Length of output: 46


🏁 Script executed:

# Check rest of config.py for more details on LoraOptions and any other Config class usage
cat -n src/instructlab/training/config.py | sed -n '135,200p'

Repository: instructlab/training

Length of output: 2234


🏁 Script executed:

# Search for any old-style Config class usage (v1 pattern)
rg "class Config:" --type py -B 5

Repository: instructlab/training

Length of output: 413


🏁 Script executed:

# Check requirements.txt and constraints-dev.txt to understand version pinning strategy
cat requirements.txt | grep -i "pydantic\|deepspeed\|wandb"

Repository: instructlab/training

Length of output: 79


Migrate LoraOptions to Pydantic v2 ConfigDict and evaluate wandb compatibility with 2.12.5.

Pydantic 2.12.5 has documented breaking changes (JSON Schema merging, annotation handling, serialization behavior, and validator signatures). The pinned versions are compatible with deepspeed and pydantic-core, but:

  1. LoraOptions (line 135–136) uses the deprecated v1-style `class Config:`; migrate `use_enum_values = True` to `model_config = ConfigDict(use_enum_values=True)` to ensure proper v2 behavior (see the sketch after this list).

  2. wandb has reported UnsupportedFieldAttributeWarning issues with Pydantic 2.12+ — confirm that the current wandb version resolves these warnings, or consider pinning to an earlier 2.x release (e.g., 2.11.x) as a temporary workaround until wandb releases a compatible patch.

  3. Tighten the pydantic constraint in requirements.txt: the current pydantic>=2.7.0 is too broad given the 2.12.x breaking changes; update it to pydantic>=2.7.0,<2.13, or pin the 2.12.x series explicitly if testing confirms stability.
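
A minimal sketch of the item-1 migration, assuming LoraOptions is a plain Pydantic BaseModel whose only config setting is use_enum_values (the enum here is hypothetical, for illustration; the real model lives in src/instructlab/training/config.py):

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict


class Precision(str, Enum):  # hypothetical enum, for illustration only
    FP16 = "fp16"
    BF16 = "bf16"


# Before: deprecated v1-style inner Config class.
class LoraOptionsV1(BaseModel):
    precision: Precision = Precision.BF16

    class Config:
        use_enum_values = True


# After: v2-style model_config with ConfigDict.
class LoraOptions(BaseModel):
    model_config = ConfigDict(use_enum_values=True)

    precision: Precision = Precision.BF16


# With use_enum_values=True, validation stores the enum's value, not the member:
assert LoraOptions(precision="fp16").precision == "fp16"
```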

🤖 Prompt for AI Agents
In `@constraints-dev.txt` around lines 151 - 153, the LoraOptions model uses a
v1-style inner class Config; replace it with Pydantic v2 syntax by adding
model_config = ConfigDict(use_enum_values=True) on LoraOptions (and remove the
class Config block) so enum serialization follows v2 semantics, then run tests.
Also verify wandb compatibility by checking the installed wandb version for
known UnsupportedFieldAttributeWarning with Pydantic 2.12+ and either
upgrade/downgrade wandb to a version that suppresses the warning or pin wandb
accordingly. Finally, tighten the pydantic requirement in requirements (change
loose "pydantic>=2.7.0" to a constrained range like "pydantic>=2.7.0,<2.13" or a
specific 2.12.x) after confirming tests pass.
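
A quick post-install sanity check against the pair pinned in this file; per the query result above, Pydantic 2.12 errors when pydantic-core does not match the exact version it was built against:

```python
# Verify the pydantic / pydantic-core pairing after installing from this
# constraints file (pydantic==2.12.5 is pinned alongside pydantic-core==2.41.5).
import pydantic
import pydantic_core

print("pydantic:", pydantic.VERSION)                # expected: 2.12.5
print("pydantic-core:", pydantic_core.__version__)  # expected: 2.41.5
```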

pylint==3.3.8 # via pylint-plugin-utils, pylint-pydantic, -r requirements-dev.txt
pylint==4.0.4 # via pylint-plugin-utils, pylint-pydantic, -r requirements-dev.txt
pylint-plugin-utils==0.9.0 # via pylint-pydantic
pylint-pydantic==0.3.5 # via -r requirements-dev.txt
pyparsing==3.2.3 # via matplotlib
pyproject-api==1.9.1 # via tox
pytest==8.4.1 # via -r requirements-dev.txt
pylint-pydantic==0.4.1 # via -r requirements-dev.txt
pyparsing==3.3.1 # via matplotlib
pyproject-api==1.10.0 # via tox
pytest==9.0.2 # via -r requirements-dev.txt
python-dateutil==2.9.0.post0 # via arrow, jupyter-client, matplotlib, pandas
python-json-logger==3.3.0 # via jupyter-events
python-json-logger==4.0.0 # via jupyter-events
pytz==2025.2 # via pandas
pyyaml==6.0.2 # via accelerate, datasets, huggingface-hub, jupyter-events, peft, pre-commit, transformers, wandb, -r requirements.txt
pyzmq==27.0.2 # via ipykernel, jupyter-client, jupyter-console, jupyter-server
referencing==0.36.2 # via jsonschema, jsonschema-specifications, jupyter-events
regex==2025.8.29 # via transformers
pyyaml==6.0.3 # via accelerate, datasets, huggingface-hub, jupyter-events, kernels, peft, pre-commit, transformers, wandb, -r requirements.txt
pyzmq==27.1.0 # via ipykernel, jupyter-client, jupyter-console, jupyter-server
referencing==0.37.0 # via jsonschema, jsonschema-specifications, jupyter-events
regex==2026.1.15 # via transformers
requests==2.32.5 # via datasets, huggingface-hub, jupyterlab-server, transformers, wandb
rfc3339-validator==0.1.4 # via jsonschema, jupyter-events
rfc3986-validator==0.1.1 # via jsonschema, jupyter-events
rfc3987-syntax==1.1.0 # via jsonschema
rich==14.1.0 # via -r requirements.txt
rpds-py==0.27.1 # via jsonschema, referencing
ruff==0.12.11 # via -r requirements-dev.txt
safetensors==0.6.2 # via accelerate, peft, transformers
send2trash==1.8.3 # via jupyter-server
sentry-sdk==2.35.1 # via wandb
setuptools==80.9.0 # via jupyterlab, tensorboard
rich==14.2.0 # via -r requirements.txt
rpds-py==0.30.0 # via jsonschema, referencing
ruff==0.14.13 # via -r requirements-dev.txt
safetensors==0.7.0 # via accelerate, peft, transformers
send2trash==2.1.0 # via jupyter-server
sentry-sdk==2.49.0 # via wandb
setuptools==80.9.0 # via jupyterlab, mamba-ssm, tensorboard
six==1.17.0 # via python-dateutil, rfc3339-validator
smmap==5.0.2 # via gitdb
sniffio==1.3.1 # via anyio
soupsieve==2.8 # via beautifulsoup4
soupsieve==2.8.2 # via beautifulsoup4
stack-data==0.6.3 # via ipython
sympy==1.13.1 # via torch
tensorboard==2.20.0 # via -r requirements-dev.txt
tensorboard-data-server==0.7.2 # via tensorboard
terminado==0.18.1 # via jupyter-server, jupyter-server-terminals
tinycss2==1.4.0 # via bleach
tokenizers==0.22.0 # via transformers
tomlkit==0.13.3 # via pylint
torch==2.6.0 # via accelerate, bitsandbytes, deepspeed, flash-attn, liger-kernel, peft, -c constraints-dev.txt.in, -r requirements.txt
tornado==6.5.2 # via ipykernel, jupyter-client, jupyter-server, jupyterlab, notebook, terminado
tox==4.29.0 # via tox-current-env, -r requirements-dev.txt
tokenizers==0.22.2 # via transformers
tomlkit==0.14.0 # via pylint
torch==2.6.0 # via accelerate, bitsandbytes, causal-conv1d, deepspeed, flash-attn, liger-kernel, mamba-ssm, peft, -c constraints-dev.txt.in, -r requirements.txt
tornado==6.5.4 # via ipykernel, jupyter-client, jupyter-server, jupyterlab, notebook, terminado
tox==4.34.1 # via tox-current-env, -r requirements-dev.txt
tox-current-env==0.0.16 # via -r requirements-dev.txt
tqdm==4.67.1 # via datasets, deepspeed, huggingface-hub, peft, transformers
traitlets==5.14.3 # via ipykernel, ipython, ipywidgets, jupyter-client, jupyter-console, jupyter-core, jupyter-events, jupyter-server, jupyterlab, matplotlib-inline, nbclient, nbconvert, nbformat
transformers==4.56.0 # via peft, trl, -r requirements.txt
triton==3.2.0 # via liger-kernel, torch
trl==0.22.1 # via -r requirements.txt
types-python-dateutil==2.9.0.20250822 # via arrow
types-pyyaml==6.0.12.20250822 # via -r requirements-dev.txt
types-requests==2.32.4.20250809 # via types-tqdm
transformers==4.57.6 # via mamba-ssm, peft, trl, -r requirements.txt
triton==3.2.0 # via liger-kernel, mamba-ssm, torch
trl==0.27.0 # via -r requirements.txt
Comment on lines +189 to +197

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

torch 2.6.0 CUDA driver requirements compatibility

💡 Result:

Short answer: PyTorch (torch) 2.6.0 Linux binaries are built/shipped with CUDA 12.6.3; that CUDA version requires NVIDIA driver 560 or newer. [1][2]

Sources:

  • PyTorch 2.6 release notes (binaries shipped with CUDA 12.6.3). [1]
  • NVIDIA PyTorch release notes / driver requirements (CUDA 12.6.3 → NVIDIA driver 560+). [2]

🌐 Web query:

triton 3.2.0 CUDA requirements

💡 Result:

Triton 3.2.0 requires CUDA 12 (wheels bundle a CUDA 12.4 toolchain); CUDA 11 and older are not supported. The Triton 3.2 series is intended to be used with PyTorch ≥ 2.6 (if using with PyTorch). [1][2]

Sources:

  • Triton Windows / PyPI notes (bundled CUDA 12.4; CUDA 12 required). [1]
  • Libraries.io / PyPI summary (Triton 3.2 → CUDA 12.4; PyTorch ≥2.6). [2]

🌐 Web query:

torch 2.6.0 triton 3.2.0 compatibility issues

💡 Result:

Short answer: PyTorch 2.6.0 expects Triton 3.2.x, but there are several reported compatibility / packaging problems with that pairing (especially on Windows and in some runtime combos).

Key points and evidence

  • PyTorch→Triton mapping: torch 2.6 is intended to work with Triton 3.2.0. [1]
  • Windows packaging: on Windows (and certain Python versions) the bundled "triton-windows" runtime that ships with PyTorch is missing symbols (e.g., AttrsDescriptor) and can cause import failures. This is a known issue. [2]
  • Runtime/autotuner errors: Triton 3.2.0 added autotuner logic that can check torch.cuda.is_available(), which has caused runtime failures with combos like torch 2.6.0 + triton 3.2.0 (e.g., Deepspeed / Ray Train repros). [3]

Practical fixes

  • Use the officially matched combo on Linux (PyTorch 2.6 + triton 3.2 wheel) when possible. [1]
  • On Windows or on Python versions where official Triton wheels are missing, avoid relying on the bundled triton-windows (use Linux or a different Python), or wait for/seek an updated triton-windows that restores missing symbols. [2]
  • If you hit autotuner / import errors, try a temporary workaround: downgrade Triton (or PyTorch) to a known-working pair (e.g., torch 2.5.1) or upgrade PyTorch/Triton to versions where the issue is resolved. [3]

Sources
[1] triton-windows / docs (PyTorch ↔ Triton mapping). (github.com)
[2] GitHub issue: ImportError AttrsDescriptor missing (Windows + torch 2.6.0 + triton). (github.com)
[3] Ray / Deepspeed GitHub issue reporting Triton 3.2.0 autotuner causing failures with torch 2.6.0. (github.com)

If you want, tell me your OS, Python and CUDA versions and the exact error/logs and I’ll suggest the most direct workaround.


Test the torch 2.6.0 + triton 3.2.0 pairing on your platform; known Windows packaging and DeepSpeed/Ray Train incompatibilities exist.

While torch 2.6.0 and triton 3.2.0 are officially paired (both use CUDA 12.x; driver 560+ required for CUDA 12.6.3), this combination has reported issues:

  • Windows: bundled triton-windows is missing symbols (e.g., AttrsDescriptor), causing import failures.
  • Runtime: triton 3.2.0's autotuner can fail with torch.cuda.is_available() checks in DeepSpeed/Ray Train contexts.

Run a smoke test (import torch, check torch.cuda.is_available(), run a small matmul; sketched below) on your target platform(s) and Python version(s). On Windows, or if you hit autotuner errors, consider downgrading to torch 2.5.1 (with triton 3.1.x) or upgrading both to a newer released pair.
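
A minimal version of that smoke test, assuming a CUDA-capable host; it falls back to CPU so the import path (where the Windows triton packaging failure shows up) is still exercised:

```python
# Smoke test for the pinned torch 2.6.0 + triton 3.2.0 pair: a failing import
# catches the Windows packaging issue, and a small matmul exercises the runtime.
import torch

print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(256, 256, device=device)
b = torch.randn(256, 256, device=device)
print("matmul ok on", device, "-> mean:", (a @ b).mean().item())
```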

🤖 Prompt for AI Agents
In `@constraints-dev.txt` around lines 189 - 197, The pinned pairing of
torch==2.6.0 and triton==3.2.0 may fail on some platforms (notably Windows) and
in DeepSpeed/Ray Train contexts; reproduce by running a quick smoke test: import
torch, call torch.cuda.is_available(), and run a small CUDA matmul to confirm
imports and GPU functionality. If you encounter import errors (missing symbols
like AttrsDescriptor on Windows) or autotuner/runtime failures, update the
constraints by switching to a known-good pair (e.g., torch==2.5.1 with triton
3.1.x) or bump both to a newer released compatible pair, and document the test
results and chosen fallback in the constraints comment so CI/platform owners can
verify.
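
A small, hypothetical pre-flight check that CI could run before the heavier smoke test; the flagged version prefixes come from the pins in this file:

```python
# Hypothetical CI guard: flag the torch/triton pairing discussed above so the
# platform owner knows to run the full smoke test before relying on DeepSpeed.
import importlib.metadata as md

torch_version = md.version("torch")    # pinned here as 2.6.0
triton_version = md.version("triton")  # pinned here as 3.2.0

if torch_version.startswith("2.6.") and triton_version.startswith("3.2."):
    print(
        f"torch {torch_version} + triton {triton_version}: known autotuner "
        "reports with DeepSpeed/Ray Train; run the full smoke test first."
    )
```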

types-pyyaml==6.0.12.20250915 # via -r requirements-dev.txt
types-requests==2.32.4.20260107 # via types-tqdm
types-tqdm==4.67.0.20250809 # via -r requirements-dev.txt
typing-extensions==4.15.0 # via aiosignal, anyio, beautifulsoup4, huggingface-hub, ipython, mypy, pydantic, pydantic-core, referencing, torch, typing-inspection, wandb
typing-inspection==0.4.1 # via pydantic
tzdata==2025.2 # via pandas
typing-extensions==4.15.0 # via aiosignal, anyio, beautifulsoup4, grpcio, huggingface-hub, ipython, mypy, pydantic, pydantic-core, referencing, torch, typing-inspection, wandb
typing-inspection==0.4.2 # via pydantic
tzdata==2025.3 # via arrow, pandas
uri-template==1.3.0 # via jsonschema
urllib3==2.5.0 # via requests, sentry-sdk, types-requests
virtualenv==20.34.0 # via pre-commit, tox
wandb==0.21.3 # via -r requirements-dev.txt
wcwidth==0.2.13 # via prompt-toolkit
webcolors==24.11.1 # via jsonschema
urllib3==2.6.3 # via requests, sentry-sdk, types-requests
virtualenv==20.36.1 # via pre-commit, tox
wandb==0.24.0 # via -r requirements-dev.txt
wcwidth==0.2.14 # via prompt-toolkit
webcolors==25.10.0 # via jsonschema
webencodings==0.5.1 # via bleach, tinycss2
websocket-client==1.8.0 # via jupyter-server
werkzeug==3.1.3 # via tensorboard
websocket-client==1.9.0 # via jupyter-server
werkzeug==3.1.5 # via tensorboard
wheel==0.45.1 # via -r requirements.txt
widgetsnbextension==4.0.14 # via ipywidgets
xxhash==3.5.0 # via datasets
yarl==1.20.1 # via aiohttp
widgetsnbextension==4.0.15 # via ipywidgets
xxhash==3.6.0 # via datasets
yarl==1.22.0 # via aiohttp