Skip to content

Conversation

@gweaverbiodev
Copy link
Contributor

Rationale for this change

s3fs.S3FileSystem requires that we pass anon directly, it cannot be passed through config_kwargs.

Are these changes tested?

yes, unit tests updated.

Are there any user-facing changes?

no

@gweaverbiodev gweaverbiodev marked this pull request as ready for review August 27, 2025 04:22
@gweaverbiodev gweaverbiodev changed the title [bug fix] pass anon explicitly to s3fs.S3FileSystem use anon argument when configuring s3fs.S3FileSystem Aug 27, 2025
@Fokko
Copy link
Contributor

Fokko commented Aug 27, 2025

@gmweaver Thanks for catching this. Looking at the PyArrow docs, shouldn't be anon be anonymouse instead?

@gweaverbiodev
Copy link
Contributor Author

Looking at the PyArrow docs, shouldn't be anon be anonymouse instead?

s3fs.S3FileSystem uses anon, while Pyarrow uses anonymous. No changes were required for pyarrow since all arguments are passed in like this S3FileSystem(**client_kwargs).

@Fokko
Copy link
Contributor

Fokko commented Aug 27, 2025

@gmweaver Good one, the names of the classes are identical 🤣

@kevinjqliu
Copy link
Contributor

okay i see, we're not passing the configs to S3FileSystem as generic **kwargs but as client_kwargs and config_kwargs
https://github.com/fsspec/s3fs/blob/d46c149d89590b4b9109213f9f92cea01504eb0b/s3fs/core.py#L288-L334

@kevinjqliu
Copy link
Contributor

i think theres actually an underlying bug here.
endpoint_url here is also part of S3FileSystem constructor. I dont think it should be passed via client_kwargs

"endpoint_url": properties.get(S3_ENDPOINT),

@gweaverbiodev
Copy link
Contributor Author

gweaverbiodev commented Aug 27, 2025

i think theres actually an underlying bug here.
endpoint_url here is also part of S3FileSystem constructor. I dont think it should be passed via client_kwargs

endpoint_url can be passed to the constructor or in client_kwargs.

endpoint_url (string (None)) – Use this endpoint_url, if specified. Needed for connecting to non-AWS S3 buckets. Takes precedence over endpoint_url in client_kwargs.

in older versions of s3fs, endpoint_url was not available in the constructor, so client_kwargs may be safer. overall, there is not a 1:1 correspondence between constructor params and valid params you can pass in client_kwargs.

@gweaverbiodev
Copy link
Contributor Author

@kevinjqliu any concerns with the PR based on my last comment? hoping to get this one into 0.10.0 release 😄

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I double checked all the configs and verified that they are set for the right constructors.
Would be great if we can add a link to the 3 different constructors here

@kevinjqliu kevinjqliu merged commit 159b2f3 into apache:main Aug 28, 2025
10 checks passed
@kevinjqliu
Copy link
Contributor

Thanks for the PR @gmweaver great catch! And thanks @Fokko for the review

@kevinjqliu
Copy link
Contributor

we're cutting a new RC for 0.10 so this should be included! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants