@systemcrash
Contributor

Exiting is rather pointless if the driver does not recognise the parameter. The driver ignores it anyway (?).

This is intended to resolve a problem where we use NUT in OpenWrt: a startup script does not know in advance which parameters a driver accepts (and users might not have read the man page), yet execution halts if an unknown parameter is encountered.

Nothing bad happens, and a warning can be printed instead of an error.

See also https://utcc.utoronto.ca/~cks/space/blog/programming/ErrorsShouldRequireFixing

exiting is rather pointless if the driver does not recognise
the parameter. The driver ignores it anyway.

Signed-off-by: Paul Donald <newtwen+github@gmail.com>
@jimklimov
Member

Well, this is arguable TBH. If the program is asked to do something it has no idea about, it can at least deduce that it won't fulfill the user's explicit requirement. It may in fact be misleading to proceed as if nothing happened, and the user should fix the config. (We do e.g. ignore user/group ID changes on platforms where they cannot be done, but there's a reason for that. And maybe that's not quite right either.)

WDYT @aquette @clepple @gdt ?

@gdt
Contributor

gdt commented Jan 12, 2026

I lean to continuing to error on bad params.

@jimklimov added labels on Jan 13, 2026: "service/daemon start/stop" (General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug) and "portability" (We want NUT to build and run everywhere possible)
@jimklimov
Member

jimklimov commented Jan 13, 2026

Posted on mailing list to gather more opinions about which approach is the lesser evil - to crash, or to pass and so hide user config errors (and welcome strange unexpected behaviours where something does not work as it presumably was told to).

On the technical side, ignoring an unknown option should indeed be same as never receiving it. To me the problem is more of a social one, where somebody explicitly writes a config file leading to/due to certain expectations, and those expectations are not met by a driver version that has no idea about that concept. So by the principle of least surprise, aborting when not knowing how to proceed (and so asking the local expert human for instructions) is the correct way.

@jimklimov marked this pull request as draft on January 13, 2026 10:58
@kellybyrd
Contributor

Another vote for: "stop on unknown params". The system cannot know whether the user intended to give a valid newer parameter to an older version, or misspelled a parameter that is valid in the current version. IMO, halting on unrecognized input is the least surprising thing NUT can do.

If not, where does this stop? What if config files have options that are not valid with the current release, or that are misspelled?

@systemcrash
Contributor Author

All good points. Would a flag be acceptable? E.g. --ignore-unrecognised-params?
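
To make the proposal concrete, here is a minimal sketch of the two behaviours behind such a flag. It is not NUT's actual parsing code: `KNOWN_PARAMS`, `parse_params` and the `ignore_unknown` argument are hypothetical names chosen for illustration.

```python
import sys

# Hypothetical set of parameters that one driver understands (illustrative only).
KNOWN_PARAMS = {"port", "pollinterval", "vendorid"}


def _stderr(msg):
    print(msg, file=sys.stderr)


def parse_params(pairs, ignore_unknown=False, log=_stderr):
    """Return the accepted parameters; abort or warn on unknown names."""
    accepted = {}
    for name, value in pairs:
        if name in KNOWN_PARAMS:
            accepted[name] = value
        elif ignore_unknown:
            # Opt-in lenient mode: warn and carry on as if the option was never given.
            log(f"warning: ignoring unrecognised parameter '{name}'")
        else:
            # Default strict mode: refuse to start.
            raise SystemExit(f"fatal: unrecognised parameter '{name}'")
    return accepted
```

With the default, an unknown name stops startup; with `ignore_unknown=True` the same input yields a warning and the remaining parameters are used.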

@jimklimov
Member

That - maybe... It would still be hiding config errors of whatever nature and provenance (typos, version mismatch, ...), but at least would shift the "blame" for that onto end-users or their distro packagers (if the flag is used in init-scripts etc.)

@gdt
Contributor

gdt commented Jan 14, 2026

I feel opposed to even a flag. It feels like extra complexity to accommodate what are fundamentally configuration errors, which are perhaps prompted by packaging systems (including "distributions") that have old nut, and an expectation that users will copy/paste/LLM new config that they don't understand.

I'd like to see a far clearer description of the reported problem, and why exiting and reading the logs isn't just as useful as not exiting and reading the logs.

And, the fix for people not reading documentation is to read the documentation :-)

@systemcrash
Contributor Author

Sorry for the long answer. Thanks for your patience.

Scenario. -FlagA exists in v1. System package manager upgrades to v2. -FlagA no longer exists. Operational downtime because service exits.

> I feel opposed to even a flag. It feels like extra complexity to accommodate what are fundamentally configuration errors, which are perhaps prompted by packaging systems (including "distributions") that have old nut, and an expectation that users will copy/paste/LLM new config that they don't understand.

In heterogeneous environments, unknown options happen from version skew, conditional features, or shared configuration files, rarely user error.

> I'd like to see a far clearer description of the reported problem, and why exiting and reading the logs isn't just as useful as not exiting and reading the logs.

The user had no working system in the meantime. We have a reported case where the user's UPS required a restart. openwrt/luci#8200

> And, the fix for people not reading documentation is to read the documentation :-)

I generally live by RTFM. I love good documentation.

Treating unknown configuration options as fatal errors seems to optimise for developer certainty at the expense of operational resilience. Ignoring unknown options with explicit warnings preserves availability, matches common practice in infrastructure software, and avoids unnecessary service failures. In infrastructure software (NUT), availability and uptime are the goal, yes?

Ignoring an unknown option is semantically identical to never having received it. 🤷

Fail-fast is valuable when:

  • Data corruption is possible
  • Partial execution is unsafe
  • Continuing could cause irreversible damage

So a hidden flag that does not show up under a -h invocation would also be valuable.

It's my stance that programs should do their best to recover from errors, and empower users (no pun).

I fully appreciate that everyone in this project is wiser than I, regarding legacy systems and all the problems over the years, so I defer to this generational wisdom.

> So by the principle of least surprise, ...

Users are often more surprised by

  • a daemon that refuses to start due to a single irrelevant option
  • a service outage caused by a non-critical config mismatch
  • software that cannot be safely upgraded or downgraded without manual config surgery

In non-homogeneous environments, robust continuation with warnings is often less surprising than abrupt termination.

So all of these points are more relevant than "what if ... options ... with the current release [are] misspelled?"

Most compilers warn on unknown pragmas rather than failing.
Many network daemons ignore unknown config directives with a warning.
POSIX shells ignore unknown environment variables by design.

These systems operate under the same ambiguity, but they choose availability over strictness anyway.

> To me the problem is more of a social one

👍 Software deployed in the real world absorbs social complexity rather than punishing it.

@gdt
Contributor

gdt commented Jan 15, 2026

> Scenario. -FlagA exists in v1. System package manager upgrades to v2. -FlagA no longer exists. Operational downtime because service exits.

I was trying to ask about actual observed problems, vs theoretical problems. Deprecations need to be thoughtful, and there's a good case that the old flag should warn for a long time after deprecation and before removal. If command-line or config file directives were removed, I would be very sympathetic to continuing to treat the old ones as warnings rather than errors, unless the new semantics really break the old intent.
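
As a rough sketch of that deprecation policy (removed options warn, never-valid options still fail), assuming hypothetical option tables that do not reflect any real NUT options:

```python
# Hypothetical option tables; the names are illustrative, not real NUT options.
CURRENT = {"port", "pollinterval"}
DEPRECATED = {"oldflag": "removed in v2; it no longer has any effect"}


def classify(name):
    """Return 'ok', 'deprecated' (warn and continue) or 'unknown' (abort)."""
    if name in CURRENT:
        return "ok"
    if name in DEPRECATED:
        return "deprecated"
    return "unknown"


def check_config(names):
    """Collect warnings for deprecated options; raise on unknown ones."""
    warnings = []
    for name in names:
        kind = classify(name)
        if kind == "deprecated":
            warnings.append(f"'{name}' is deprecated: {DEPRECATED[name]}")
        elif kind == "unknown":
            raise ValueError(f"unknown option '{name}'")
    return warnings
```

Here a config that was valid in an older release keeps working with a warning, while a typo such as `prot` is still rejected.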

> In heterogeneous environments, unknown options happen from version skew, conditional features, or shared configuration files, rarely user error.

Version skew, meaning config for one version when you're running another (with "shared configuration files" being one way to end up there), is a sysadmin error.

> The user had no working system in the meantime. We have a reported case where the user's UPS required a restart. openwrt/luci#8200

Thanks. This seems to be a restricted case, where there is mfr/model that some drivers take and others don't. My immediate reaction is that this sort of conditional syntax is not good and we should change to always accept those two args. It's ok for a driver that doesn't care about -mfr to ignore it.

(I don't follow "UPS required a restart". It seems like they found the problem quickly, and with the config file fixed it would have started.)

I see this as an entirely different situation from e.g. upsd getting some random command-line arg.

> Treating unknown configuration options as fatal errors seems to optimise for developer certainty at the expense of operational resilience. Ignoring unknown options with explicit warnings preserves availability, matches common practice in infrastructure software, and avoids unnecessary service failures. In infrastructure software (NUT), availability and uptime are the goal, yes?

Reliable operation is the goal. I think we just see it differently, that fail/fix is the better path to reliability than random/ignore. I certainly do not expect random config entries in daemons to result in the system running.

For me, a key point is that when typing I sometimes misspell args, and I'm glad to have an exit vs running without the option I intended to configure. What if there was a default to listen to all addresses, and an arg to make it only localhost, and that were misspelled? (Theoretical I know; nut defaults closed.)

> Users are often more surprised by
>
> * a daemon that refuses to start due to a single irrelevant option
> * a service outage caused by a non-critical config mismatch
> * software that cannot be safely upgraded or downgraded without manual config surgery

They shouldn't be surprised by the first. The third is really about attention to backwards compat in nut development; config that used to be valid should still be ok. And the second is begging the question. When things are wrong, it's only "non-critical" if the obtained semantics are ok vs what was meant.

There's a larger issue in that your system seems to run old NUT. I suppose you intend to keep doing that, but if a change is made now, you would have it in several years, even if you're still behind by then.

@systemcrash
Contributor Author

We're at 2.8.4 on master and upcoming... OK thanks for the input.

So does accepting (and ignoring) some of those common usb params seem a better solution?

@kellybyrd
Contributor

Thanks for the detailed reply!

The example given doesn't feel like version skew to me. It's this whole "diff drivers may take diff params" thing. The version thing is different IMO, because the OS (OpenWRT) is responsible for its startup scripts making sense with the version of NUT it installs.

The problem here, I think, is a difference of expectations for user configuration. Diff drivers have diff valid parameters, which I think the NUT project tends to solve by expecting the user to have a man page and a text editor. In OpenWRT, you've got a single UI page for all possible drivers.

So, it seems either that LUCI page needs to know about all sets of valid and invalid options for each driver, or the nut driver code can decide to allow the superset of all driver options and ignore ones that aren't relevant to a given driver.
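
The second option could look roughly like the sketch below. The driver names and the `DRIVER_PARAMS` table are hypothetical placeholders, not NUT's real driver list; the point is only that a name known to some driver is ignored, while a name known to no driver is still an error.

```python
# Hypothetical data: which options each driver actually uses.
DRIVER_PARAMS = {
    "driver_a": {"port", "mfr", "model"},
    "driver_b": {"port"},
}
# Superset of every option that any driver knows about.
ALL_PARAMS = set().union(*DRIVER_PARAMS.values())


def split_params(driver, params):
    """Split params into (used, ignored); raise on names no driver knows."""
    unknown = sorted(set(params) - ALL_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {', '.join(unknown)}")
    own = DRIVER_PARAMS[driver]
    used = {k: v for k, v in params.items() if k in own}
    ignored = sorted(k for k in params if k not in own)
    return used, ignored
```

For example, passing `mfr` to `driver_b` would put it in the ignored list (which could be logged as a warning), whereas a misspelling like `prot` would still stop startup.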

Regarding reliability, because of what NUT and UPSes do, it's an odd situation. It's not like configuring a service the user will use immediately and check that it functions. It's configuring a thing I want to behave in a specific way much later, during some rare event. IMO, being strict and careful about input and yelling immediately (by refusing to start) seems like the better option. On the other hand, maybe I'm wrong? If the driver reports status right away, maybe that counts as "using NUT right after configuring".

@gdt
Contributor

gdt commented Jan 15, 2026

+1 to @kellybyrd's comments.

Separately from this discussion, I would think that if you care that daemons are running, then there should be some monitor/complain background process. On my nut system, I wrote a python program to report via mqtt, and the mqtt consumer sends me a message when the reporting goes offline. That happens when an OS upgrade resets permissions on the serial port, until I fix it again.

A long way of saying "if it's important that nut be running, there should be monitoring that it is ok, and addressing 1 of N ways it might not run isn't really going to fix the larger issue".
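
The consumer side of that idea can be sketched as a simple staleness check. This is not the actual program mentioned above: `StalenessMonitor` is a hypothetical name, and the transport (mqtt or anything else) is left out, so only the "complain when reports stop arriving" logic is shown.

```python
import time


class StalenessMonitor:
    """Track the time of the last report and flag when it is too old."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock  # injectable so the check can be tested without sleeping
        self.last_seen = None

    def record_report(self):
        """Call this whenever a status report arrives."""
        self.last_seen = self.clock()

    def is_stale(self):
        """True if no report was ever seen, or the last one is older than max_age."""
        if self.last_seen is None:
            return True
        return self.clock() - self.last_seen > self.max_age
```

A periodic job would call `is_stale()` and send a notification when it returns true, regardless of why the daemon stopped reporting.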
