Add config source interface and environment variable source for config resolution#640
Add config source interface and environment variable source for config resolution#640ubaskota wants to merge 4 commits intosmithy-lang:config_resolution_mainfrom
Conversation
| config_value = os.environ.get(env_var) | ||
| if config_value is None: | ||
| return None | ||
| return config_value.strip() |
There was a problem hiding this comment.
Alternate option, commit this if you think it's more readable. Functionally, these are equivalent, just does the same thing in two fewer lines
| config_value = os.environ.get(env_var) | |
| if config_value is None: | |
| return None | |
| return config_value.strip() | |
| if config_value := os.environ.get(env_var): | |
| return config_value.strip() |
| :param key: The standard configuration key (e.g., 'region', 'retry_mode'). | ||
| :returns: The value from the corresponding environment variable, or None if not set or empty. |
There was a problem hiding this comment.
This isn't what's actually implemented. Specifically, when we get an empty environment variable, we're still treating it as an empty string.
There was a problem hiding this comment.
I will update the docstring/comment here and below.
| # Empty string should be treated as None | ||
| assert value == "" |
There was a problem hiding this comment.
The comment says we should treating an empty string as None which contradicts your assertion right after
| @@ -0,0 +1,27 @@ | |||
| # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | |||
There was a problem hiding this comment.
nit - The source.py name seems too generic. I think we should either rename this config.py and have all config related interfaces in this file, or be more specific with the path interfaces/config/source.py. The first option is probably my preferred
| """ | ||
| ... | ||
|
|
||
| def get(self, key: str) -> Any | None: |
There was a problem hiding this comment.
Looks like this method definition isn't future proofing us for when we'll need to make async calls to read the AWS config file
There was a problem hiding this comment.
This is intentionally sync. For the config file source, the plan is to read and cache the snapshot of the file during the first access and it’s just an in-memory dict lookup on all subsequent requests. Making this async would require maintaining parallel async/sync Config implementations and forceing users to await simple property access, which significantly complicates both the codebase and the customer interface.
| self._prefix = prefix | ||
|
|
||
| @property | ||
| def name(self) -> str: |
There was a problem hiding this comment.
nit - It feels weird to have a name property and the public SOURCE constant. For a few reasons:
- Which one should we use?
- Why are their names different? (source vs name)
Also, should we be forcing this to be implemented through the protocol? It feels like it. It is already in the protocol
If you decide the keep the constant, it should be private and use Final type hint (ref) so it can't be changed.
from typing import Final
class EnvironmentSource:
_NAME: Final[str] = "environment"
@property
def name(self) -> str:
return self._NAMEThere was a problem hiding this comment.
I will use the name property and remove SOURCE
| """Returns the source name.""" | ||
| return self.SOURCE | ||
|
|
||
| def get(self, key: str) -> Any | None: |
There was a problem hiding this comment.
This will only ever be str | None for environment source right? Since that is a subset of Any | None, the type checker should be okay with us being more specific?
|
|
||
| SOURCE = "environment" | ||
|
|
||
| def __init__(self, prefix: str = "AWS_"): |
There was a problem hiding this comment.
This assumes that all env vars will be prefixed with AWS_ which is the the case for now. However, if we ever need to add a Python specific environment variable, how can we do that in this system?
There was a problem hiding this comment.
First, the prefix is configurable, so it can be changed as needed. If there are ever cases where the variables don't follow the straight forward "SOME_PREFIX" + key.upper() format, we'll add a dictionary in source that maps to those values based on the requirements. For example: "region" could map to "AWS_DEFAULT_REGION". Support for use case like this will be added later based on the need.
| assert source.name == "environment" | ||
|
|
||
| def test_get_region_from_aws_region(self): | ||
| with patch.dict(os.environ, {"AWS_REGION": "us-west-2"}, clear=False): |
There was a problem hiding this comment.
We should be careful with setting clear=False as it can cause environment variables to leak across tests. To make each test deterministic, let's ensure the environment is cleared before the test starts.
This issue came up when we added tests for the environment credential provider in botocore. Tests were leaking env vars and causing other tests to behave unexpectedly.
There was a problem hiding this comment.
+1 We've had issues with this past (see #544). I think all tests should be using clear=True. test_source_does_not_cache_env_vars might be the only exception. In that test we might want to do true then false.
There was a problem hiding this comment.
I see. I will keep that in mind. clear=False just means it won't wipe out the existing environment variables and they will be preserved inside the context. Other values set with patch.dict will remain until inside the context.
There was a problem hiding this comment.
We should still clear the environment vars in each test to keep them isolated. I think we only need to do this for the first patch.dict call in each test to clear the previous values before adding new ones in the test.
| config_value = os.environ.get(env_var) | ||
| if config_value is None: | ||
| return None | ||
| return config_value.strip() |
There was a problem hiding this comment.
I think we should call strip() before checking if the config_value is an empty string.
For example, if a customer provides AWS_REGION=" ", it will return "" instead of None.
If we strip the config value prior to checking emptiness, it would correctly return None.
There was a problem hiding this comment.
We want to use whatever value is set for that environment variable. If a value is set to an empty string, it'll return an empty string. The docstring will be updated.
| source = EnvironmentSource() | ||
| value = source.get("region") | ||
| # Whitespaces should be stripped | ||
| assert value == "" |
There was a problem hiding this comment.
Shouldn't this case return None? This allows the region config value will resolve to an empty string which seems incorrect.
There was a problem hiding this comment.
It will return an empty string. We want to use whatever value is set for that environment variable.
Issue #, if available:
Description of changes:
This change adds the
ConfigSourceprotocol to smithy-coreinterfaces and implementsEnvironmentSourceinsmithy-aws-corefor resolving configuration values from environment variables. Includes unit tests covering protocol compliance, key-to-env-var mapping, edge cases, and no-caching behavior.Added unit tests to validate the expected behavior. Verified that tests pass.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.