-
Notifications
You must be signed in to change notification settings - Fork 77
feat: Impl LocationProvider #467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
1aee659 to
932bd2c
Compare
932bd2c to
d855a40
Compare
src/iceberg/location_provider.h
Outdated
| /// \param filename a file name | ||
| /// \return a fully-qualified location URI for a data file | ||
| /// | ||
| /// TODO(wgtmac): StructLike is not well thought yet, we may wrap an ArrowArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update the comment.
src/iceberg/location_provider.h
Outdated
| /// \param input_location the table location | ||
| /// \param properties the table properties | ||
| /// \return a LocationProvider instance | ||
| static Result<std::unique_ptr<LocationProvider>> For(const std::string& input_location, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Make to be consistent with other APIs.
src/iceberg/expression/literal.cc
Outdated
| .value_or("invalid literal of type decimal"); | ||
| } | ||
| case TypeId::kString: { | ||
| // TODO(zhuo.wang): escape string? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO ToString is for human readable representation. So it is no need to escape it here?
src/iceberg/location_provider.h
Outdated
| /// system. | ||
| class ICEBERG_EXPORT DefaultLocationProvider : public LocationProvider { | ||
| public: | ||
| DefaultLocationProvider(const std::string& table_location, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use std::string_view for table_location?
src/iceberg/location_provider.h
Outdated
|
|
||
| Result<std::string> NewDataLocation(const PartitionSpec& spec, | ||
| const PartitionValues& partition_data, | ||
| const std::string& filename) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question for filename to use string_view in both NewDataLocation functions. We can change the base class function signature if appropriate.
src/iceberg/location_provider.cc
Outdated
| std::string DirsFromHash(int32_t hash) { | ||
| std::string hash_with_dirs; | ||
|
|
||
| for (int i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| for (int i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) { | |
| for (int32_t i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) { |
src/iceberg/location_provider.cc
Outdated
| } | ||
|
|
||
| std::string PathContext(std::string_view table_location) { | ||
| std::string path = LocationUtil::StripTrailingSlash(table_location); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If StripTrailingSlash returns a string_view, we can avoid creating a string object here.
src/iceberg/location_provider.cc
Outdated
|
|
||
| size_t last_slash = path.find_last_of('/'); | ||
| if (last_slash != std::string::npos && last_slash < path.length() - 1) { | ||
| std::string_view data_path(path.data(), path.size() - last_slash - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| std::string_view data_path(path.data(), path.size() - last_slash - 1); | |
| std::string_view data_path = std::string_view(path).substr(last_slash + 1); |
src/iceberg/location_provider.h
Outdated
|
|
||
| /// \brief DefaultLocationProvider privides default location provider for local file | ||
| /// system. | ||
| class ICEBERG_EXPORT DefaultLocationProvider : public LocationProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we can move these impl classes to location_provider.cc without declaring them here.
| Result<std::unique_ptr<StructType>> PartitionType(const Schema& schema) const; | ||
|
|
||
| /// \brief Get the partition path for the given partition data. | ||
| Result<std::string> PartitionPath(const PartitionValues& data) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some test cases for this? This function can also be in a separate PR if you see fit.
No description provided.