-
Notifications
You must be signed in to change notification settings - Fork 471
Add unified port Property #5966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Is there an example for how a user would configure ports for the individual servers? There is a comment in #5845 about setting this in |
Yea I think you're right. The port range prop can of course be set when manually starting a component but I don't think it can be added to accumulo-env.sh without significant changes at least. One way that seems like it could work well is to add a port range field to cluster.yaml. That way we could specify the port range per component type. |
I guess another way would be to create a new variable that can be added to the end of accumulo/assemble/bin/accumulo Line 89 in 8177db1
This would allow that variable to be set in |
Yea that would work too. Thinking through adding it to cluster.yaml, it might look something like this: So we would define a range per component type per resource group. Not sure that we want/need that kind of fine grained control or if we just want ranges per component type regardless of RG. |
|
I think the port range needs to be able to be specified at the server type level. The Accumulo client needs to be able to interact with the Manager, ScanServer, and TabletServer processes. The client doesn't interact with the Compactor, GarbageCollector, or Monitor. There is no good way to configure a firewall to only allow clients to talk with the Manager, ScanServer, and TabletServer with a single port range being used for all processes. Edit: I re-read your example, and that might work. |
|
Moving the range property to the service definition inside If adding this property to If each file has to contain distinct property definitions then the existing |
|
I think the port range property that is sent down to For example, if |
As mentioned, may be able to do this already in accumulo-env.sh using |
We currently can't do this in accumulo-env.sh because all of the JAVA_OPTS are set before calling start.Main. accumulo/assemble/bin/accumulo Line 89 in b29f014
We used to support EXTRA_ARGS by accumulo/assemble/bin/accumulo-cluster Lines 166 to 167 in 8f52dd1
But that was removed in 2.1.4 @dlmarion is right, we would need to create another var that gets passed to accumulo-service and then added to the end of the |
|
In aa5b11a I made it so we can set per-server-component-type properties in I tested with the following in and it all seemed to work as expected: |
assemble/bin/accumulo
Outdated
| # Allow users to supply extra Accumulo arguments via ACCUMULO_MAIN_ARGS | ||
| if ! declare -p ACCUMULO_MAIN_ARGS &>/dev/null; then | ||
| ACCUMULO_MAIN_ARGS=() | ||
| else | ||
| declare_output=$(declare -p ACCUMULO_MAIN_ARGS 2>/dev/null) | ||
| if [[ $declare_output != declare\ -a* ]]; then | ||
| ACCUMULO_MAIN_ARGS_RAW=${ACCUMULO_MAIN_ARGS[*]} | ||
| read -r -a ACCUMULO_MAIN_ARGS <<<"${ACCUMULO_MAIN_ARGS_RAW}" | ||
| fi | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is handled differently than how JAVA_OPTS is handled, is there a reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I have a couple of thoughts on this:
- This is making me really regret that I didn't push harder against the
-oproperty passing. My original proposal when we reworked the launch scripts was to support Java system properties for accumulo configs, as in:-Daccumulo.rpc.bind.port=9999. I don't know the reason Mike implemented the-o, but it is really making it inconvenient here. If we could just pass them as system properties, then this would be a non-issue. And, I still think that's the better implementation. - The default value should be a range that servers of any type can share. So, most of the time, users don't need to specify anything in their config files.
- If one is going to specify properties in config files, we don't need any of this, because we can inject properties into the commons-configuration file using property interpolation. If a user wants to set a custom port, they can either use a shared
accumulo.propertiesfile that has an entry likerpc.bind.port=${env.RPC_PORT}and then doexport RPC_PORT=9999inaccumulo-env.shbased on the server type, or they can use something likerpc.bind.port=${sys:accumulo.rpc.bind.port}and set-Daccumulo.rpc.bind.port=9999in theirJAVA_OPTSbased on the server type, OR they can import server-type-specific properties using includes that pull in a specific config for a specific server type, and they can spread their config over multiple config files, OR they can choose different wholly independent config files, by adding-Daccumulo.properties=/path/to/server-specific.propertiesin theiraccumulo-env.shbased on the server type.
So, we just don't need this. We already have better options than making the scripts more complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding point # 3, it would be useful to have an example of how the includes would work. If you understand commons-configuration and how it works, then it's obvious. If you don't, then it's not obvious and the user is unlikely to even think about it as a viable option.
core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java
Outdated
Show resolved
Hide resolved
| RPC_BIND_PORT("rpc.bind.port", "19000-19999", PropertyType.PORT, | ||
| """ | ||
| The port or range of ports servers attempt to bind for RPC traffic. Provide a single \ | ||
| value to target an exact port (will attempt higher ports if given port is already in use, \ | ||
| up to 1000 additional checks), or a range using formats like '19000-19999' to allow searching for \ | ||
| the first available port within that range. | ||
| """, | ||
| "4.0.0"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will attempt higher ports if given port is already in use, up to 1000 additional checks
I don't think we should do this. If the user specifies a single port, then we should honor it. If they want to use port search, then they should specify a range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This was one of the things in the original ticket:
A single integer value,
A, should be interpreted as the range[A, A+1), and implies no port searching.... use that one value or fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in d6d8553. Only ports in the given range will be used.
assemble/bin/accumulo
Outdated
| # Allow users to supply extra Accumulo arguments via ACCUMULO_MAIN_ARGS | ||
| if ! declare -p ACCUMULO_MAIN_ARGS &>/dev/null; then | ||
| ACCUMULO_MAIN_ARGS=() | ||
| else | ||
| declare_output=$(declare -p ACCUMULO_MAIN_ARGS 2>/dev/null) | ||
| if [[ $declare_output != declare\ -a* ]]; then | ||
| ACCUMULO_MAIN_ARGS_RAW=${ACCUMULO_MAIN_ARGS[*]} | ||
| read -r -a ACCUMULO_MAIN_ARGS <<<"${ACCUMULO_MAIN_ARGS_RAW}" | ||
| fi | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I have a couple of thoughts on this:
- This is making me really regret that I didn't push harder against the
-oproperty passing. My original proposal when we reworked the launch scripts was to support Java system properties for accumulo configs, as in:-Daccumulo.rpc.bind.port=9999. I don't know the reason Mike implemented the-o, but it is really making it inconvenient here. If we could just pass them as system properties, then this would be a non-issue. And, I still think that's the better implementation. - The default value should be a range that servers of any type can share. So, most of the time, users don't need to specify anything in their config files.
- If one is going to specify properties in config files, we don't need any of this, because we can inject properties into the commons-configuration file using property interpolation. If a user wants to set a custom port, they can either use a shared
accumulo.propertiesfile that has an entry likerpc.bind.port=${env.RPC_PORT}and then doexport RPC_PORT=9999inaccumulo-env.shbased on the server type, or they can use something likerpc.bind.port=${sys:accumulo.rpc.bind.port}and set-Daccumulo.rpc.bind.port=9999in theirJAVA_OPTSbased on the server type, OR they can import server-type-specific properties using includes that pull in a specific config for a specific server type, and they can spread their config over multiple config files, OR they can choose different wholly independent config files, by adding-Daccumulo.properties=/path/to/server-specific.propertiesin theiraccumulo-env.shbased on the server type.
So, we just don't need this. We already have better options than making the scripts more complicated.
| MANAGER_CLIENTPORT("manager.port.client", "9999", PropertyType.PORT, | ||
| "The port used for handling client connections on the manager.", "1.3.5"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably a good idea to keep these around and deprecated. If the new property is set, then we should use the new behavior and ignore anything in the old properties. Otherwise, we should read the old properties and use them.
Ideally, we'd introduce this in 3.1 and deprecate them there, so we don't need to have these in 4.0 at all.
Because this is a much simpler config, that hopefully users just don't have to worry about much because the defaults are reasonable, I don't think it would be completely terrible if we just changed this without a deprecation, but we would definitely need to call that out in the release notes.
| if (input == null) { | ||
| return true; | ||
| } | ||
| final String trimmed = input.trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're trimming for validation, we need to trim when we use it. It might make sense here, but I think it would be novel. Looking at the code, we only trim the input in PropertyType in two other places, and one might be a bug.
- We trim when we strip the units off of the end of bounded units, so we can isolate the numeric part, because we're only testing the number part.
- We trim absolute paths during validation (might be a bug... seems sketchy at a glance)
In general, we assume that the property value is exactly what the user specified, including any spaces. I don't know that we should always do that, but it would be nice if we were consistent, and it doesn't look like we do anything to transform the user's input by stripping off whitespace in most of the other property types, so we probably shouldn't here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the use of .trim in 7375e2d. Added a return false to PortPredicate() if the value contains any whitespace in that commit too. That will cause PropertyType.PORT.isValidFormat() to return false before the whitespace causes issues further down the line.
| if ("0".equals(trimmed)) { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is 0 valid here? It seems like it shouldn't be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several existing places in the code where we set "0" as the value for the port:
accumulo/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloConfigImpl.java
Lines 292 to 296 in 4c259d7
| private void mergePropWithRandomPort(String key) { | |
| if (!siteConfig.containsKey(key)) { | |
| siteConfig.put(key, "0"); | |
| } | |
| } |
This is used for all the
accumulo/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloConfigImpl.java
Lines 186 to 189 in 4c259d7
| mergePropWithRandomPort(Property.MANAGER_CLIENTPORT.getKey()); | |
| mergePropWithRandomPort(Property.TSERV_CLIENTPORT.getKey()); | |
| mergePropWithRandomPort(Property.MONITOR_PORT.getKey()); | |
| mergePropWithRandomPort(Property.GC_PORT.getKey()); |
It looks like eventually the prop gets used in the server builder code to create a new InetSocketAddress(host, port) whose javadoc says "A valid port value is between 0 and 65535. A port number of zero will let the system pick up an ephemeral port in a bind operation."
It would be a change in functionality to disallow "0".
There are several other places as well as tests and ITs that set 0 for the port. Maybe we could document this better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the docs for RPC_BIND_PORT in d6d8553
|
Here are some notes on what I have tried: env interpolationIn One thing to note is the syspropSimilar to the env interpolation. Set commons configuration
|
Fixes #5845
rpc.bind.portProperty for their port candidate list. This new prop has a default value of19000-19999. Individual port props have been removed (*_CLIENTPORTproperties)M-NAll the changed tests and ITs pass and so does the sunny maven profile.