-
Notifications
You must be signed in to change notification settings - Fork 77
[confgenerator] Support LoggingProcessorParseMultilineRegex in Otel Logging.
#2103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
3f4322a to
68b4a04
Compare
367d8fa to
6ac6e16
Compare
confgenerator/logging_processors.go
Outdated
|
|
||
| var exprParts []string | ||
| for _, r := range isFirstEntry { | ||
| exprParts = append(exprParts, fmt.Sprintf("body.message matches %q", r)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not build exprParts directly in the first loop and eliminate isFirstEntry altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No real reason. In the other parse_multiline PR, we have a more complicated expressions setup, so it made more sense to do it in steps. I simplified it. Done!
| // TODO: b/459877163 - Update implementation when opentelemetry supports "state-machine" multiline parsing. | ||
| if r.StateName == "start_state" { | ||
| isFirstEntry = append(isFirstEntry, r.Regex) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that by ignoring some states, there will be some possible log inputs that won't parse properly as multiline. (If that's not true, then we should be able to refactor the receivers to only have start_state.)
How do you want to approach testing for that gap? E.g. will you add/change transformation tests later along with b/459877163 to validate whichever edge cases wouldn't work today without a full state machine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the current uses of LoggingProcessorParseMultilineRegex 1 only set two states start_state and cont_state in a simplified manner such that start_state = FirstLogLineRegex and cont_state = Negation of "FirstLogLineRegex" (note : i've just double checked 1, also git grep -C 10 "start_state" helps).
Lines 84 to 94 in ccfedc9
| Rules: []confgenerator.MultilineRule{ | |
| { | |
| StateName: "start_state", | |
| NextState: "cont", | |
| Regex: `^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}\s[A-z]+\s{1,5}`, | |
| }, | |
| { | |
| StateName: "cont", | |
| NextState: "cont", | |
| Regex: `^(?!\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}\s[A-z]+\s{1,5})`, | |
| }, |
This implies all current uses of LoggingProcessorParseMultilineRegex can be fully replicated by only setting "is_first_entry" in otel logging.
Proposed Refactor
We could refactor LoggingProcessorParseMultilineRegex to only be able to set a start_state and then set cont_state programatically as the "negation" of the "start_state". This will enforce this simplified use of multiline features.
How do you want to approach testing for that gap? E.g. will you add/change transformation tests later along with b/459877163 to validate whichever edge cases wouldn't work today without a full state machine?
It depends. What do you think of the Proposed Refactor ?
If we refactor LoggingProcessorParseMultilineRegex to only set a start_state, then there won't be any feature gaps and the 3P app receiver tests are good enough for this.
If we don't refactor it, creating a "transformation_test" would be artificial since we would need to create a "test processor" that uses all the feature of the "state-machine". A "transformation_test" needs a "registered" processor to be able to add to a pipeline.
Footnotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NVM, not "all" 3P app receiver implementations use the simplified set of start_state and cont_state. Though it's only mysql_slow and elasticsearch_json the ones that use a more complicated (not too much) set of regexes.
How do you want to approach testing for that gap? E.g. will you add/change transformation tests later along with b/459877163 to validate whichever edge cases wouldn't work today without a full state machine?
The tests of mysql_slow and elasticsearch_json can serve to compare with the use of all "state-machine" like features.
See draft refactor : dc991f6#diff-7def08d2dee0c2606af18bb82d03c649a7ece83d1fb913fcfb800c6558eb942e
6ac6e16 to
67d3e0e
Compare
Description
Implement
LoggingProcessorParseMultilineRegexandLoggingProcessorParseRegexComplexin Otel Logging.Details
saphanareceiver incorrect use of type "int" which should be "integer".logging-otel-receiver_kafkaconfgenerator test to validate resulting config.Related issue
b/440599473
How has this been tested?
Checklist: