Skip to content

Add config source interface and environment variable source for config resolution#640

Open
ubaskota wants to merge 4 commits intosmithy-lang:config_resolution_mainfrom
ubaskota:config_source
Open

Add config source interface and environment variable source for config resolution#640
ubaskota wants to merge 4 commits intosmithy-lang:config_resolution_mainfrom
ubaskota:config_source

Conversation

@ubaskota
Copy link

Issue #, if available:

Description of changes:
This change adds the ConfigSource protocol to smithy-core interfaces and implements EnvironmentSource in smithy-aws-core for resolving configuration values from environment variables. Includes unit tests covering protocol compliance, key-to-env-var mapping, edge cases, and no-caching behavior.

  • Tests *
    Added unit tests to validate the expected behavior. Verified that tests pass.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ubaskota ubaskota requested a review from a team as a code owner February 18, 2026 15:57
Comment on lines +32 to +35
config_value = os.environ.get(env_var)
if config_value is None:
return None
return config_value.strip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternate option, commit this if you think it's more readable. Functionally, these are equivalent, just does the same thing in two fewer lines

Suggested change
config_value = os.environ.get(env_var)
if config_value is None:
return None
return config_value.strip()
if config_value := os.environ.get(env_var):
return config_value.strip()

:param key: The standard configuration key (e.g., 'region', 'retry_mode').
:returns: The value from the corresponding environment variable, or None if not set or empty.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't what's actually implemented. Specifically, when we get an empty environment variable, we're still treating it as an empty string.

Copy link
Author

@ubaskota ubaskota Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the docstring/comment here and below.

Comment on lines 35 to 36
# Empty string should be treated as None
assert value == ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says we should treating an empty string as None which contradicts your assertion right after

@@ -0,0 +1,27 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
Copy link
Contributor

@jonathan343 jonathan343 Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - The source.py name seems too generic. I think we should either rename this config.py and have all config related interfaces in this file, or be more specific with the path interfaces/config/source.py. The first option is probably my preferred

"""
...

def get(self, key: str) -> Any | None:
Copy link
Contributor

@jonathan343 jonathan343 Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this method definition isn't future proofing us for when we'll need to make async calls to read the AWS config file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentionally sync. For the config file source, the plan is to read and cache the snapshot of the file during the first access and it’s just an in-memory dict lookup on all subsequent requests. Making this async would require maintaining parallel async/sync Config implementations and forceing users to await simple property access, which significantly complicates both the codebase and the customer interface.

self._prefix = prefix

@property
def name(self) -> str:
Copy link
Contributor

@jonathan343 jonathan343 Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - It feels weird to have a name property and the public SOURCE constant. For a few reasons:

  • Which one should we use?
  • Why are their names different? (source vs name)

Also, should we be forcing this to be implemented through the protocol? It feels like it. It is already in the protocol

If you decide the keep the constant, it should be private and use Final type hint (ref) so it can't be changed.

from typing import Final

class EnvironmentSource:
    _NAME: Final[str] = "environment"

    @property
    def name(self) -> str:
        return self._NAME

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use the name property and remove SOURCE

"""Returns the source name."""
return self.SOURCE

def get(self, key: str) -> Any | None:
Copy link
Contributor

@jonathan343 jonathan343 Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only ever be str | None for environment source right? Since that is a subset of Any | None, the type checker should be okay with us being more specific?


SOURCE = "environment"

def __init__(self, prefix: str = "AWS_"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that all env vars will be prefixed with AWS_ which is the the case for now. However, if we ever need to add a Python specific environment variable, how can we do that in this system?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, the prefix is configurable, so it can be changed as needed. If there are ever cases where the variables don't follow the straight forward "SOME_PREFIX" + key.upper() format, we'll add a dictionary in source that maps to those values based on the requirements. For example: "region" could map to "AWS_DEFAULT_REGION". Support for use case like this will be added later based on the need.

assert source.name == "environment"

def test_get_region_from_aws_region(self):
with patch.dict(os.environ, {"AWS_REGION": "us-west-2"}, clear=False):
Copy link
Contributor

@arandito arandito Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be careful with setting clear=False as it can cause environment variables to leak across tests. To make each test deterministic, let's ensure the environment is cleared before the test starts.

This issue came up when we added tests for the environment credential provider in botocore. Tests were leaking env vars and causing other tests to behave unexpectedly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 We've had issues with this past (see #544). I think all tests should be using clear=True. test_source_does_not_cache_env_vars might be the only exception. In that test we might want to do true then false.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I will keep that in mind. clear=False just means it won't wipe out the existing environment variables and they will be preserved inside the context. Other values set with patch.dict will remain until inside the context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still clear the environment vars in each test to keep them isolated. I think we only need to do this for the first patch.dict call in each test to clear the previous values before adding new ones in the test.

config_value = os.environ.get(env_var)
if config_value is None:
return None
return config_value.strip()
Copy link
Contributor

@arandito arandito Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call strip() before checking if the config_value is an empty string.

For example, if a customer provides AWS_REGION=" ", it will return "" instead of None.

If we strip the config value prior to checking emptiness, it would correctly return None.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to use whatever value is set for that environment variable. If a value is set to an empty string, it'll return an empty string. The docstring will be updated.

source = EnvironmentSource()
value = source.get("region")
# Whitespaces should be stripped
assert value == ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this case return None? This allows the region config value will resolve to an empty string which seems incorrect.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will return an empty string. We want to use whatever value is set for that environment variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments