Skip to content

Conversation

@WZhuo
Copy link
Contributor

@WZhuo WZhuo commented Dec 31, 2025

No description provided.

@WZhuo WZhuo force-pushed the location_provider branch from 1aee659 to 932bd2c Compare December 31, 2025 05:56
@WZhuo WZhuo closed this Dec 31, 2025
@WZhuo WZhuo force-pushed the location_provider branch from 932bd2c to d855a40 Compare December 31, 2025 05:56
@WZhuo WZhuo reopened this Dec 31, 2025
/// \param filename a file name
/// \return a fully-qualified location URI for a data file
///
/// TODO(wgtmac): StructLike is not well thought yet, we may wrap an ArrowArray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update the comment.

/// \param input_location the table location
/// \param properties the table properties
/// \return a LocationProvider instance
static Result<std::unique_ptr<LocationProvider>> For(const std::string& input_location,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Make to be consistent with other APIs.

.value_or("invalid literal of type decimal");
}
case TypeId::kString: {
// TODO(zhuo.wang): escape string?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO ToString is for human readable representation. So it is no need to escape it here?

/// system.
class ICEBERG_EXPORT DefaultLocationProvider : public LocationProvider {
public:
DefaultLocationProvider(const std::string& table_location,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use std::string_view for table_location?


Result<std::string> NewDataLocation(const PartitionSpec& spec,
const PartitionValues& partition_data,
const std::string& filename) override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question for filename to use string_view in both NewDataLocation functions. We can change the base class function signature if appropriate.

std::string DirsFromHash(int32_t hash) {
std::string hash_with_dirs;

for (int i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) {
for (int32_t i = 0; i < kEntropyDirDepth * kEntropyDirLength; i += kEntropyDirLength) {

}

std::string PathContext(std::string_view table_location) {
std::string path = LocationUtil::StripTrailingSlash(table_location);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If StripTrailingSlash returns a string_view, we can avoid creating a string object here.


size_t last_slash = path.find_last_of('/');
if (last_slash != std::string::npos && last_slash < path.length() - 1) {
std::string_view data_path(path.data(), path.size() - last_slash - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string_view data_path(path.data(), path.size() - last_slash - 1);
std::string_view data_path = std::string_view(path).substr(last_slash + 1);


/// \brief DefaultLocationProvider privides default location provider for local file
/// system.
class ICEBERG_EXPORT DefaultLocationProvider : public LocationProvider {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we can move these impl classes to location_provider.cc without declaring them here.

Result<std::unique_ptr<StructType>> PartitionType(const Schema& schema) const;

/// \brief Get the partition path for the given partition data.
Result<std::string> PartitionPath(const PartitionValues& data) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some test cases for this? This function can also be in a separate PR if you see fit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants