-
Notifications
You must be signed in to change notification settings - Fork 0
Enable CHecksum at object finalize #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This pull request introduces a mechanism to enable checksum validation during the finalization of object uploads. By including 'crc32c' and 'md5Hash' in the 'X-Goog-Hash' header for the last uploaded chunk, it ensures data integrity. Highlights: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request aims to add checksum validation for object finalization in storage uploads. The changes correctly identify the last chunk to add the X-Goog-Hash header. However, there are several critical issues. The implementation for parsing checksums from the request body is not robust and can lead to crashes if the body is empty or not valid JSON. Additionally, there are multiple binding.pry statements left in the code, which are debugging tools and must be removed before this can be merged. I've provided suggestions to fix the checksum logic and pointed out the debugging statements that need to be removed.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces the functionality to include checksums (crc32c, md5Hash, md5) in the X-Goog-Hash header when finalizing an object upload. It also includes a minor improvement to an error logging message. The core change aims to enhance data integrity verification during uploads.
However, a critical issue has been identified regarding the parsing of the request body for checksum data. The current implementation can lead to a runtime error if the body is empty or malformed, and it performs inefficiently by parsing the same data repeatedly.
| request_header[CONTENT_RANGE_HEADER] = get_content_range_header current_chunk_size | ||
| request_header[CONTENT_LENGTH_HEADER] = current_chunk_size.to_s | ||
| last_chunk= remaining_content_size <= current_chunk_size | ||
| hash_data= JSON.parse(body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The body variable (which is self.body of the command) can be an empty string if no request_object (metadata) is provided. JSON.parse('') will raise a JSON::ParserError, causing the upload to crash. This line needs to safely handle cases where body is empty or not valid JSON.
Furthermore, if the body contains static metadata like checksums, parsing it in every send_upload_command call (which occurs for each chunk) is inefficient. Consider parsing this data once, perhaps in the prepare! method, and storing the formatted_string in an instance variable for reuse.
hash_data = body.to_s.empty? ? {} : JSON.parse(body)|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables checksum validation for storage uploads by adding the X-Goog-Hash header. The changes correctly add the header for both the initial resumable upload request and the final chunk of a multipart upload. The implementation is accompanied by a comprehensive set of tests.
I've identified a few issues:
- A high-severity issue where a debugging dependency (
pry) is included in the test files. - A high-severity issue regarding unhandled JSON parsing errors which could lead to a crash.
- A medium-severity style issue in the code.
- A medium-severity typo in a test description.
Please address these points to improve code quality and robustness.
| require_relative '../../../spec_helper' | ||
| require 'google/apis/core/storage_upload' | ||
| require 'google/apis/core/json_representation' | ||
| require 'pry' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mahendra-google
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. You can modify PR title to follow conventional commit.
thiyaguk09
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This pull request introduces a mechanism to enable checksum validation during the finalization of object uploads. By including 'crc32c' and 'md5Hash' in the 'X-Goog-Hash' header for the last uploaded chunk, it ensures data integrity.
Highlights:
Checksum Validation: Implemented the addition of the 'X-Goog-Hash' header, containing 'crc32c' and 'md5Hash' values, specifically for the final chunk of an object upload to ensure data integrity.