Fix DoS via malicious HDF5 dataset metadata in KerasFileEditor #21880

HyperPS · 2025-11-29T14:06:01Z

This PR adds validation for extremely large HDF5 dataset metadata to prevent
remote DoS via HDF5 shape bombs in .keras files.

Vuln (Reported on Huntr and GHSA)

Includes:

• Defensive size validation
• Rejection of hostile shapes and dimension overflows
• Recursion path fix for correct inner_path handling
• Avoids unbounded numpy allocation of multi-gigabyte tensors
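
To illustrate why a declared shape on its own can be dangerous, the sketch below computes the byte size implied by a shape with Python's arbitrary-precision integers and compares it with what a 64-bit fixed-width product would report. The function names and the example shape are hypothetical and are not taken from this PR's diff.

```python
from math import prod


def declared_nbytes(shape, itemsize):
    """Exact byte count implied by a shape (Python ints do not overflow)."""
    return prod(shape) * itemsize


def wrapped_nbytes_u64(shape, itemsize):
    """Same product, truncated to 64 bits after every multiplication."""
    total = itemsize
    for dim in shape:
        total = (total * dim) & 0xFFFFFFFFFFFFFFFF
    return total


# Hypothetical hostile shape: tiny on disk as metadata, enormous when allocated.
shape = (2**32, 2**32)
exact = declared_nbytes(shape, 4)       # true size in bytes
wrapped = wrapped_nbytes_u64(shape, 4)  # what a wrapped 64-bit counter would see
```

Here the wrapped value would pass any "is it smaller than the limit" check even though the exact size is far beyond available memory, which is why a size check is safer when done on exact integers.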

Backward compatible with valid Keras weight files.

google-cla · 2025-11-29T14:06:06Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist · 2025-11-29T14:06:16Z

Summary of Changes

Hello @HyperPS, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical security vulnerability by enhancing the KerasFileEditor to robustly validate HDF5 dataset metadata. It introduces comprehensive checks for malformed shapes, excessively high tensor ranks, and large memory allocations, effectively preventing potential Denial of Service attacks that could arise from specially crafted .keras files. The update ensures safer loading of model weights without compromising backward compatibility.

Highlights

• DoS Prevention: Fixes a Denial of Service (DoS) vulnerability related to malicious HDF5 dataset metadata in .keras files, specifically preventing 'HDF5 shape bombs'.
• Metadata Validation: Implements defensive size validation, rejection of hostile shapes (e.g., negative dimensions, rank > 64), and checks for dimension overflows in HDF5 datasets.
• Memory Allocation Control: Prevents unbounded NumPy allocation of multi-gigabyte tensors by enforcing a maximum memory size limit of 1 GiB for HDF5 datasets during loading.
• Recursion Path Fix: Corrects the handling of inner_path during recursive calls within _extract_weights_from_store to ensure proper state management.
• Backward Compatibility: Ensures that the implemented security measures are backward compatible with valid Keras weight files.
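
The checks listed above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the function name `validate_dataset_metadata` and its parameters are hypothetical, and only the rank limit of 64 and the 1 GiB cap come from the summary.

```python
from math import prod
from operator import index

MAX_RANK = 64        # rank limit mentioned in the summary
MAX_BYTES = 1 << 30  # 1 GiB cap mentioned in the summary


def validate_dataset_metadata(shape, itemsize,
                              max_rank=MAX_RANK, max_bytes=MAX_BYTES):
    """Return the declared byte size, or raise ValueError for hostile metadata.

    Hypothetical helper: it inspects only the shape and item size, so it can
    run before any array is allocated.
    """
    if len(shape) > max_rank:
        raise ValueError(f"rank {len(shape)} exceeds limit {max_rank}")

    # operator.index accepts integer-like values and rejects floats.
    dims = [index(d) for d in shape]
    if any(d < 0 for d in dims):
        raise ValueError(f"negative dimension in shape {tuple(dims)}")

    # Python integers are arbitrary precision, so this product cannot wrap.
    nbytes = prod(dims) * itemsize
    if nbytes > max_bytes:
        raise ValueError(f"dataset declares {nbytes} bytes, limit is {max_bytes}")
    return nbytes
```

A caller would run this on a dataset's declared shape and dtype item size first, and only materialize the array if no exception is raised.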

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively addresses a critical Denial of Service vulnerability by introducing comprehensive validation for HDF5 dataset metadata before loading. The changes include checks for hostile shapes, dimension overflows, and excessive memory allocation, which successfully prevents 'shape bomb' attacks. Additionally, the refactoring of the _extract_weights_from_store method significantly improves code clarity and correctness, notably by fixing a bug in the recursive path handling. The overall implementation is robust and enhances the security of file loading operations.
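
The diff for the recursive path fix is not reproduced in this thread, so the following is only a generic illustration of the pattern the review describes: each recursive call receives its own path value instead of modifying state shared with its caller. The nested dict stands in for HDF5 groups, and `collect_paths` is a hypothetical name, not the signature of `_extract_weights_from_store`.

```python
def collect_paths(store, inner_path=""):
    """Return {path: value} for every leaf of a nested dict."""
    result = {}
    for key, value in store.items():
        # Build the child's path locally; the caller's inner_path is untouched,
        # so sibling entries are not affected by earlier recursive calls.
        child_path = f"{inner_path}/{key}" if inner_path else key
        if isinstance(value, dict):
            result.update(collect_paths(value, child_path))
        else:
            result[child_path] = value
    return result


store = {
    "layers": {
        "dense": {"kernel": 1, "bias": 2},
        "conv": {"kernel": 3},
    },
    "optimizer": 4,
}
paths = collect_paths(store)
```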

keras/src/saving/file_editor.py

codecov-commenter · 2025-11-29T14:13:12Z

Codecov Report

❌ Patch coverage is 41.66667% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.55%. Comparing base (f2c00fe) to head (376885d).

Files with missing lines	Patch %	Lines
keras/src/saving/file_editor.py	41.66%	9 Missing and 5 partials ⚠️
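
The patch figures above can be cross-checked with a little arithmetic. The sketch assumes that partially covered lines count as not covered in the patch percentage, which matches the report's statement that 14 lines are missing coverage.

```python
missing = 9
partials = 5
uncovered = missing + partials      # 14 lines, as stated in the report

patch_coverage = 41.66667 / 100     # fraction of changed lines that are covered

# uncovered / total = 1 - patch_coverage, so solve for the total line count.
total = round(uncovered / (1 - patch_coverage))
covered = total - uncovered
recomputed = 100 * covered / total  # percentage recomputed from whole lines
```

Under that assumption the patch touches 24 measured lines, of which 10 are covered.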

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #21880      +/-   ##
==========================================
- Coverage   82.57%   82.55%   -0.02%     
==========================================
  Files         577      577              
  Lines       59599    59620      +21     
  Branches     9351     9355       +4     
==========================================
+ Hits        49213    49220       +7     
- Misses       7978     7987       +9     
- Partials     2408     2413       +5

Flag	Coverage Δ
keras	`82.37% <41.66%> (-0.02%)`	⬇️
keras-jax	`62.86% <41.66%> (-0.02%)`	⬇️
keras-numpy	`57.51% <41.66%> (-0.01%)`	⬇️
keras-openvino	`34.32% <0.00%> (-0.02%)`	⬇️
keras-tensorflow	`64.39% <41.66%> (-0.02%)`	⬇️
keras-torch	`63.56% <41.66%> (-0.02%)`	⬇️

hertschuh · 2025-12-01T23:10:32Z

@HyperPS

Thank you for the PR!

Vuln (Reported on Huntr and GHSA)

Do you have references/links for these?

HyperPS · 2025-12-01T23:18:53Z

Thanks @hertschuh. The Huntr report is still private (I reported the issue there), so there's no public link yet; maintainers can access it via the magic-link email from Huntr. I also submitted the same vulnerability to GHSA, where it is currently in the private review queue, so there isn't a public link for that yet either.

Fix DoS via malicious HDF5 dataset metadata in KerasFileEditor

35f65cc

google-ml-butler bot added the size:M label Nov 29, 2025

google-ml-butler bot assigned gbaned Nov 29, 2025

gemini-code-assist bot reviewed Nov 29, 2025

Refactor: move MAX_BYTES constant outside loop per review feedback

376885d
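
The refactor named in this commit title follows a common pattern: a limit that never changes is defined once rather than rebuilt on every loop iteration. A minimal sketch, assuming the 1 GiB value from the summary; where the constant actually lives in the PR (module, class, or function scope) is not shown in this thread, and `oversized` is a hypothetical helper.

```python
MAX_BYTES = 1 << 30  # defined once, outside any loop (1 GiB, per the summary)


def oversized(entries, max_bytes=MAX_BYTES):
    """Return the names of entries whose declared size exceeds the limit."""
    flagged = []
    for name, nbytes in entries:
        # The loop body only compares against the limit; it does not redefine it.
        if nbytes > max_bytes:
            flagged.append(name)
    return flagged


flagged = oversized([("kernel", 4096), ("bomb", 1 << 40), ("bias", 64)])
```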
