+
Skip to content

Conversation

haircommander
Copy link
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

We want a way to judging how long the actual exec process is running in a container, to be totally fair in how we account for the exec. Unfortunately, there is not a great signal other than pid file creation (which still is a bit early). This improves the time the actual container process gets for exec by about 10ms

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

fix a bug where a container exec process received a little less time than the timeout provided

@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. labels Jul 9, 2021
@openshift-ci openshift-ci bot requested review from kolyshkin and vrothberg July 9, 2021 19:55
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 9, 2021
@haircommander
Copy link
Member Author

/cherry-pick release-1.21

@openshift-cherrypick-robot

@haircommander: once the present PR merges, I will cherry-pick it on top of release-1.21 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Copy link
Member Author

/cherry-pick release-1.20

@openshift-cherrypick-robot

@haircommander: once the present PR merges, I will cherry-pick it on top of release-1.20 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Copy link
Member Author

/cherry-pick release-1.19

@openshift-cherrypick-robot

@haircommander: once the present PR merges, I will cherry-pick it on top of release-1.19 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@codecov
Copy link

codecov bot commented Jul 9, 2021

Codecov Report

Merging #5067 (3724bad) into master (0013dcd) will increase coverage by 3.78%.
The diff coverage is 36.16%.

❗ Current head 3724bad differs from pull request most recent head 2297dba. Consider uploading reports for the commit 2297dba to get more accurate results

@@            Coverage Diff             @@
##           master    #5067      +/-   ##
==========================================
+ Coverage   40.15%   43.93%   +3.78%     
==========================================
  Files         115      109       -6     
  Lines        9317    11349    +2032     
==========================================
+ Hits         3741     4986    +1245     
- Misses       5157     5888     +731     
- Partials      419      475      +56     

utils/notify.go Outdated
}
go func() {
defer notify.Stop(c)
for {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S1000: should use for range instead of for { select {} }
(at-me in a reply with help or ignore)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u right

@haircommander
Copy link
Member Author

I've opted to move to github.com/rjeczalik/notify instead of fsnotify, partially because it seems more active, and also because I was having trouble with the unit tests and the accuracy of the cross-platform events. As a result of the new package, I have been able to shave off another ~30 ms, leaving the runc overhead at about 4ms on my machine, as opposed to 50 ms before this PR

@haircommander
Copy link
Member Author

I also had the idea to test what the behavior was like when conmon was running execs.

turns out, conmon was very generous to container processes--giving somewhere between 40ms and 50ms over the timeout:
sudo crictl exec --sync --timeout 1 8d sleep 1.05

in other words, this PR changes us from giving 100ms less from pre-conmonless-exec days to about 50ms less. BUT we are technically closer to the actual time the container process is execing

@haircommander haircommander force-pushed the exec-inotify branch 4 times, most recently from c0651c0 to a484419 Compare July 14, 2021 15:40
@haircommander
Copy link
Member Author

/retest

@haircommander
Copy link
Member Author

dropping a484419 for now to see if that broke stuff

@haircommander
Copy link
Member Author

/retest-required

Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, just a few nits

return filepath.Join(c.bundlePath, "conmon-pidfile")
}

func WatchForFile(ctx context.Context, path string, opsToWatch []notify.Event) chan error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function can be private

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not if we want to unit test it, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, yeah right.

return filepath.Join(c.bundlePath, "conmon-pidfile")
}

func WatchForFile(ctx context.Context, path string, opsToWatch []notify.Event) chan error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func WatchForFile(ctx context.Context, path string, opsToWatch []notify.Event) chan error {
func WatchForFile(ctx context.Context, path string, opsToWatch ...notify.Event) chan error {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo ctx is unused

go func() {
defer notify.Stop(c)
for ei := range c {
fmt.Println(ei)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo gotta drop this

@haircommander haircommander force-pushed the exec-inotify branch 2 times, most recently from 4454e34 to 85aa875 Compare July 15, 2021 20:23
Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 16, 2021
@saschagrunert
Copy link
Member

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 16, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [haircommander,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@haircommander
Copy link
Member Author

time="2021-07-16T10:18:11Z" level=error msg="Error copying docker://quay.io/crio/oom:latest to dir:/go/src/github.com/cri-o/cri-o/.artifacts/oom-image: Error reading blob sha256:131fcea386e736dfaf075424d7e2a31ae8a7fcbb9a87ba93c8a70edd120b
Error pulling docker://quay.io/crio/oom:latest

quay flake
/retest

@haircommander
Copy link
Member Author

/retest

Signed-off-by: Peter Hunt <pehunt@redhat.com>
to be more reusable

Signed-off-by: Peter Hunt <pehunt@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Be careful about leaking anything

Signed-off-by: Peter Hunt <pehunt@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 16, 2021
@haircommander
Copy link
Member Author

I dropped the bats test, because it was flaky, and I added some code to make sure we're not leaking channels or routines.

@haircommander
Copy link
Member Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 16, 2021

@haircommander: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
ci/openshift-jenkins/e2e_crun_cgroupv2 2297dba link /test e2e_cgroupv2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mrunalp
Copy link
Member

mrunalp commented Jul 16, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 16, 2021
@openshift-merge-robot openshift-merge-robot merged commit cd1c5a0 into cri-o:master Jul 16, 2021
@openshift-cherrypick-robot

@haircommander: #5067 failed to apply on top of branch "release-1.21":

Applying: vendor: add notify package
Using index info to reconstruct a base tree...
M	go.mod
M	go.sum
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
CONFLICT (content): Merge conflict in vendor/modules.txt
Auto-merging go.sum
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 vendor: add notify package
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@haircommander: #5067 failed to apply on top of branch "release-1.20":

Applying: vendor: add notify package
Using index info to reconstruct a base tree...
M	go.mod
M	go.sum
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
CONFLICT (content): Merge conflict in vendor/modules.txt
Auto-merging go.sum
CONFLICT (content): Merge conflict in go.sum
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 vendor: add notify package
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@haircommander: #5067 failed to apply on top of branch "release-1.19":

Applying: vendor: add notify package
Using index info to reconstruct a base tree...
M	go.mod
M	go.sum
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
CONFLICT (content): Merge conflict in vendor/modules.txt
Auto-merging go.sum
CONFLICT (content): Merge conflict in go.sum
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 vendor: add notify package
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载