Regex capture groups silently require whitespace #397

@ohmann

Empty named capture groups in the regex(es) used to parse log lines are silently rewritten to require whitespace before them. This behavior is undocumented and is visible only when using the --debugParse flag, so it is likely to confuse users.

Example command:
synoptic.sh -r "(?<ITIME>),(?<ip>),(?<TYPE>)" [... other args ...]

Example debug parse snippet:

INFO: input: (?<ITIME>),(?<ip>),(?<TYPE>) 
INFO: processed: (?:\s*(?<ITIME>\S+)\s*),(?:\s+(?<ip>\S+)\s*),(?:\s+(?<TYPE>\S+)\s*) 
INFO: standard: (?:\s*(\S+)\s*),(?:\s+(\S+)\s*),(?:\s+(\S+)\s*)

Note that the capture group (?<ip>) is transformed into (?:\s+(?<ip>\S+)\s*), i.e., it won't match a log line unless it is preceded by one or more whitespace characters. The same happens to (?<TYPE>); in this example only the first group, (?<ITIME>), gets \s* and so accepts zero whitespace. This is deceptive, since the regex the user passed does not mention whitespace at all. The user would expect a line like 123,4.4.4.4,event to match, but it does not.
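To make the mismatch concrete, here is a minimal sketch that runs the "standard" regex from the debug output against both forms of the line. It uses Python's re module rather than Synoptic itself, on the assumption that \s, \S, and non-capturing groups behave the same way as in Java's regex engine for this pattern. The helper name parse and the use of a full-line match are illustrative choices, not Synoptic's actual code.

```python
import re

# Copied verbatim from the "standard" line of the debug output above.
STANDARD = r"(?:\s*(\S+)\s*),(?:\s+(\S+)\s*),(?:\s+(\S+)\s*)"


def parse(line):
    """Hypothetical helper: return the captured fields, or None if no match."""
    m = re.fullmatch(STANDARD, line)
    return m.groups() if m else None


# No whitespace after the commas: the \s+ in the 2nd and 3rd groups cannot match.
print(parse("123,4.4.4.4,event"))    # None

# A space after each comma satisfies \s+, so the line parses.
print(parse("123, 4.4.4.4, event"))  # ('123', '4.4.4.4', 'event')
```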

This only affects the default behavior for empty groups. Specifying a capture group's contents explicitly, e.g., (?<ip>\S+), works as expected and does not silently require whitespace before it.
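As a rough check of the workaround, the sketch below applies the explicit form of the pattern to the whitespace-free line. It only shows that the regex as the user writes it matches. That Synoptic leaves such groups unmodified is taken from the report above, not verified here. Python spells named groups (?P<name>...) where Synoptic's Java-style syntax uses (?<name>...), and parse_explicit is a hypothetical helper name.

```python
import re

# Explicit group contents; Python needs (?P<name>...) for named groups,
# whereas the Synoptic command line would use (?<name>...).
EXPLICIT = r"(?P<ITIME>\S+),(?P<ip>\S+),(?P<TYPE>\S+)"


def parse_explicit(line):
    """Hypothetical helper: return fields by name, or None if no match."""
    m = re.fullmatch(EXPLICIT, line)
    return m.groupdict() if m else None


print(parse_explicit("123,4.4.4.4,event"))
# {'ITIME': '123', 'ip': '4.4.4.4', 'TYPE': 'event'}
```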
