-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations #52289
[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations #52289
Conversation
@crassirostris: GitHub didn't allow me to request PR reviews from the following users: igorpeshansky. Note that only kubernetes members can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from me.
@igorpeshansky could you please take a look?
@@ -332,6 +332,17 @@ data: | |||
</metric> | |||
</match> | |||
|
|||
# Trim the entries which exceed slightly less than 100KB, to avoid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add todo to remove this hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
@type record_transformer | ||
enable_ruby true | ||
<record> | ||
log ${record['log'].length > 100000 ? '[The entry was trimmed] ' + record['log'][0:100000 - 25] : record['log']} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to use 0:100000 - 25
because 100000
is already smaller than the 100k limit (102400
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this message break JSON encoded log lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, agree that it's better to keep it clean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this message break JSON encoded log lines?
Yes, it will. It is a conscious decision, because the alternative is dropping this JSON entry
if len(log.TextPayload) == originalLength { | ||
return false, fmt.Errorf("got non-trimmed entry of length %d", len(log.TextPayload)) | ||
} | ||
return true, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you'll be documenting the trimmed entry prefix, want to also check that it's being prepended to trimmed entries (i.e., check that the entry starts with [The entry was trimmed]
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, it created a dependency on the string constant between the code and configuration
OTOH, it's better to explicitly change/remove the test instead of debugging the problem b/c of lack of this check
3aaa124
to
a5e8915
Compare
@piosz @igorpeshansky PTAL |
b04ead3
to
7a76ca8
Compare
if len(log.TextPayload) == originalLength { | ||
return false, fmt.Errorf("got non-trimmed entry of length %d", len(log.TextPayload)) | ||
} | ||
if !strings.HasSuffix(log.TextPayload, trimSuffix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I shrank the added part and moved it to the end of the message, because then it will clearly signalize that this is not an end of the original message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanded on my previous comment.
@@ -332,6 +332,18 @@ data: | |||
</metric> | |||
</match> | |||
|
|||
# TODO(instrumetation): Reconsider this workaround later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: instrumetation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, thanks
@type record_transformer | ||
enable_ruby true | ||
<record> | ||
log ${record['log'].length > 100000 ? "#{record['log'][0..100000]}[trimmed]" : record['log']} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suffix will cause the whole line to be parsed before we realize it's malformed JSON and revert to text. A prefix would short-circuit that, so it's more efficient. You could add both if it makes for better human readability (e.g., "[Trimmed long entry] {"a": "very long string...
")...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, agree, thanks
7a76ca8
to
cf93209
Compare
@igorpeshansky Done, PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
cf93209
to
8cec585
Compare
@@ -383,7 +395,7 @@ data: | |||
num_threads 2 | |||
</match> | |||
metadata: | |||
name: fluentd-gcp-config-v1.1 | |||
name: fluentd-gcp-config-v1.1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not 1.2? the change is pretty significant and I believe just major and minor is enough for config purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks. Kept patch version to be consistent with the semver
8cec585
to
4761bb2
Compare
/lgtm |
/retest |
c9ef369
to
d2ef703
Compare
d2ef703
to
d8525f8
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: crassirostris, piosz Associated issue requirement bypassed by: piosz The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@crassirostris: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375) |
FYI the test modifications broke the e2e test: #52433 |
…-#52289-upstream-release-1.7 Automated cherry pick of #52289
Does this mean the rest of the message, the extra length (exceeding the 100KB) is dropped and not logged? |
It also seems that the stackdriver quota for a logentry size is 256KB not 100KB https://cloud.google.com/logging/quotas |
Stackdriver doesn't support log entries bigger than 100KB, so by default fluentd plugin just drops such entries. To avoid that and increase the visibility of this problem it's suggested to trim long lines instead.
/cc @igorpeshansky