Environment Sandboxes #1676
The current behaviour seems (almost) exactly as I would expect it. Non-workers should always operate in a sandboxed mode where putenv and getenv are specific to the current request. Or more specifically, the current PHP execution. Therefore I'd expect workers to behave the same way they do now, except, if I understood you correctly, it should not be possible to influence the environment of a different worker. Unless the second worker you were talking about is a different thread of the same worker as worker 1.

```
# one.domain.tld
php {
    worker index.php 2
}
# two
php {
    worker other.php
}
```

```php
// worker one thread 1
putenv("FOO=BAR");
// worker one thread 2
putenv("FOO=BAZ");
// worker two
putenv("FOO=BAL");
// worker one thread 1 later
getenv("FOO"); // expect BAZ
```
The environment resetting between requests is actually how FPM behaves. I also thought this was weird when I first encountered it, but it's how PHP has always worked. It's probably not a good idea to change this behavior in CGI mode now. For worker mode, I think it makes sense not to reset the environment, since global and static variables persist across requests anyway. The only unfortunate thing about the current setup is that putenv can potentially have race conditions, since it still needs to affect the actual environment.
Current Behaviour
When trying to address #1674, I noticed a few inconsistencies, and I’m not sure what the desired behaviour is. This is a discussion on what that desired behaviour might be and how to implement it.
Before we get to worker mode, we need to discuss CGI mode. If you run the following script:
You would expect to see the following output on the first request:
However, we continue to see this exact same output on any further request because as soon as we begin to wait for the next request, we clear out the sandbox env.
After 20+ years of writing software, I find this very surprising behaviour. If I set an environment variable, I would expect to see that value later on. This can be even more surprising with the following script:
Then we see the following behaviour:
To make matters more confusing, worker mode behaves differently from CGI mode when using the same script as above:
In #1674 (comment), @AlliBalliBaba mentions that this is potentially expected behaviour. I believe it works this way to allow for the situation where random threads serve multiple sites with potentially different environments (such as a hosting context). I would like to propose a new method for managing environments, but I a) want to document how we expect it to work, and b) want to make sure we’re aligned.
New behaviour for environment variables
For the most part, things will look quite similar: each thread will have an environment sandbox; however, the sandbox will be assigned via an optional request option: `withEnvironmentSandbox(hostedSandbox)`, where `hostedSandbox` is a pointer to a shared sandbox (`sync.Map`) per `php_server`/`php` directive. This means it is potentially possible for workers and nonworkers to share a sandbox. However, multiple routes are usually defined with their own directives.
How it works
In Caddy
Upon startup, a copy is made of the current environment into a "sandbox". Any `env` directives are applied after copying the env, allowing operators to "override" existing environment variables. This is most likely to be useful in embedded application cases where the operator wants to ensure customers/clients can’t override certain variables, accidentally or intentionally.
When a request is received from a client, we pass along the `withEnvironmentSandbox` option, containing a pointer to the correct sandbox. On initialisation of workers, we also pass on this option.
In FrankenPHP
When initialising workers, we apply the appropriate sandbox if one is given. Otherwise, we bypass the sandbox and access the raw environment.
When a nonworker thread receives a request, it reads the option and sets its environment sandbox accordingly. If no sandbox is given, it removes its current sandbox and accesses the raw environment.
Calling `putenv` will also set the raw environment.
putenv, setenv, $_ENV, and $_SERVER
In all cases, the environment sandbox will be accessed for filling superglobals and accessing the environment, unless no sandbox is provided, in which case the raw environment will be accessed. It is up to Go authors (including our Caddy Module) to provide this sandbox and fill it with initial values (we can provide a helper here).
Visible Effects
This should make environments more logically consistent. Applications may still be able to notice inconsistencies between threads by calling into the shell just like they can today. For example, imagine this interleaving scenario:
There really isn’t a way around this, though, and it has been possible since the beginning. However, it is possible to mitigate this by manually setting important environment variables in the call:
Definition
Breaking Changes
This will be a breaking change for anyone relying on the current behaviour.
cc: @henderkes @dunglas @AlliBalliBaba