Skip to content

Conversation

@shubhangi-google
Copy link
Owner

@shubhangi-google shubhangi-google commented Dec 3, 2025

This pull request introduces a mechanism to enable checksum validation during the finalization of object uploads. By including 'crc32c' and 'md5Hash' in the 'X-Goog-Hash' header for the last uploaded chunk, it ensures data integrity.

Highlights:
Checksum Validation: Implemented the addition of the 'X-Goog-Hash' header, containing 'crc32c' and 'md5Hash' values, specifically for the final chunk of an object upload to ensure data integrity.

@gemini-code-assist
Copy link

gemini-code-assist bot commented Dec 3, 2025

This pull request introduces a mechanism to enable checksum validation during the finalization of object uploads. By including 'crc32c' and 'md5Hash' in the 'X-Goog-Hash' header for the last uploaded chunk, it ensures data integrity.

Highlights:
Checksum Validation: Implemented the addition of the 'X-Goog-Hash' header, containing 'crc32c' and 'md5Hash' values, specifically for the final chunk of an object upload to ensure data integrity.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to add checksum validation for object finalization in storage uploads. The changes correctly identify the last chunk to add the X-Goog-Hash header. However, there are several critical issues. The implementation for parsing checksums from the request body is not robust and can lead to crashes if the body is empty or not valid JSON. Additionally, there are multiple binding.pry statements left in the code, which are debugging tools and must be removed before this can be merged. I've provided suggestions to fix the checksum logic and pointed out the debugging statements that need to be removed.

@shubhangi-google
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the functionality to include checksums (crc32c, md5Hash, md5) in the X-Goog-Hash header when finalizing an object upload. It also includes a minor improvement to an error logging message. The core change aims to enhance data integrity verification during uploads.

However, a critical issue has been identified regarding the parsing of the request body for checksum data. The current implementation can lead to a runtime error if the body is empty or malformed, and it performs inefficiently by parsing the same data repeatedly.

request_header[CONTENT_RANGE_HEADER] = get_content_range_header current_chunk_size
request_header[CONTENT_LENGTH_HEADER] = current_chunk_size.to_s
last_chunk= remaining_content_size <= current_chunk_size
hash_data= JSON.parse(body)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The body variable (which is self.body of the command) can be an empty string if no request_object (metadata) is provided. JSON.parse('') will raise a JSON::ParserError, causing the upload to crash. This line needs to safely handle cases where body is empty or not valid JSON.

Furthermore, if the body contains static metadata like checksums, parsing it in every send_upload_command call (which occurs for each chunk) is inefficient. Consider parsing this data once, perhaps in the prepare! method, and storing the formatted_string in an instance variable for reuse.

hash_data = body.to_s.empty? ? {} : JSON.parse(body)

@shubhangi-google
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enables checksum validation for storage uploads by adding the X-Goog-Hash header. The changes correctly add the header for both the initial resumable upload request and the final chunk of a multipart upload. The implementation is accompanied by a comprehensive set of tests.

I've identified a few issues:

  • A high-severity issue where a debugging dependency (pry) is included in the test files.
  • A high-severity issue regarding unhandled JSON parsing errors which could lead to a crash.
  • A medium-severity style issue in the code.
  • A medium-severity typo in a test description.

Please address these points to improve code quality and robustness.

require_relative '../../../spec_helper'
require 'google/apis/core/storage_upload'
require 'google/apis/core/json_representation'
require 'pry'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The pry gem is required in this test file. This is a debugging tool and should be removed before merging to avoid adding unnecessary dependencies to the project.

Copy link
Collaborator

@mahendra-google mahendra-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. You can modify PR title to follow conventional commit.

Copy link
Collaborator

@thiyaguk09 thiyaguk09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants