-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add endpoint call for pod freezer #11802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #11802 +/- ##
==========================================
- Coverage 87.54% 87.47% -0.08%
==========================================
Files 196 196
Lines 9506 9515 +9
==========================================
+ Hits 8322 8323 +1
- Misses 912 919 +7
- Partials 272 273 +1
Continue to review full report at Codecov.
|
pkg/queue/concurrency_state.go
Outdated
// runs the `resume` function. If either of `pause` or `resume` are not passed, it runs | ||
// the respective local function(s). The local functions are the expected behavior; the | ||
// function parameters are enabled primarily for testing purposes. | ||
func ConcurrencyStateHandler(endpoint string, h http.Handler, logger *zap.SugaredLogger, pause, resume func()) http.HandlerFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
best to keep logger as first param for consistency
pkg/queue/concurrency_state.go
Outdated
func ConcurrencyStateHandler(endpoint string, h http.Handler, logger *zap.SugaredLogger, pause, resume func()) http.HandlerFunc { | ||
logger.Info("Concurrency state tracking enabled") | ||
|
||
if pause == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd drop this nil check, if someone is explicitly passing in a nil function here they deserve a panic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought here was to have default pause/resume functions, so a nil value here would mean to use the default. But maybe it's better to explicitly include the pause/resume functions in the call from queue/main.go ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I'd pull the functions out and explicitly inject them. You're going to need to do this I think to comfortably test them anyway (which we should do in this PR).
pkg/queue/concurrency_state.go
Outdated
logger.Info("Concurrency state tracking enabled") | ||
|
||
if pause == nil { | ||
pause = func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably pull this out to a separate top-level function or interface implementation rather than defining the default inline here. That should also be a lot easier to test (which we should do :)).
pkg/queue/concurrency_state.go
Outdated
if pause == nil { | ||
pause = func() { | ||
logger.Info("Requests dropped to zero, pausing pod ...") | ||
pauseEndpoint := endpoint + ":9696/freeze" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we hard-code the port and path here or let the endpoint env var define that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Port I think is fine to include in the envvar. If we include path in the envvar, then I think we'd need to pass freeze/resume as part of the request (or use two endpoints, but I don't think that's a good idea).
Not sure which option makes more sense (keeping path hard coded or adding the action to the request) makes more sense. In either case, I think we'll ultimately have to define what is required from any custom "pod freezer" service that gets called out to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we add the path then I think we probably want to be more generic than freeze and thaw. Another option though would be to use a single endpoint and pass a json structure to it (so it would be the same endpoint either way, but the json structure would define whether the event was to-zero or from-zero).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. The more I think about it, a single endpoint with the action taken in the request itself seems like the right path to take.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@julz Another thought -- if the path is in the envvar, then we don't necessarily need separate pause/resume functions. We could use a single function and pass the JSON as a parameter.
If that approach makes sense, then perhaps I should update the function signature in #11783 before it merges so that it stays consistent...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may still want it as separate functions to make it easy to test (otherwise you'd have to assert on the json being sent), but we can always refactor as we play with this more. We could also have a single function with a boolean or something. Probably adding some unit tests would be helpful at this point anyway, and then I can stop guessing about what will or won't make things more testable ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was writing the tests that triggered my thought (and also led me to the same conclusion you have, that it's easier to test with separate functions) :)
pkg/queue/concurrency_state.go
Outdated
req.Header.Add("Token", "") // TODO: use serviceaccountToken from projected volume | ||
resp, err := http.DefaultClient.Do(req) | ||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to think through these failure cases, a panic doesn't seem good enough here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking out loud a bit here...
If the resume call fails, in an ideal state I think I'd want to log the failure and then retry at least one more time (perhaps it was just due to a network blip?). If still unable to resume the container, then I think I'd want to launch another pod and send the request to it (though not sure how feasible that would be). The key here is to get an active pod back up.
If the case where the pause call fails, I think that logging the failure and retrying might be sufficient. A not-paused paused container doesn't really negatively impact the end-user (their service would've scaled down anyways). That said, an alternative might be to kill the pod and respawn it in a paused state (which might be a better choice from a provider perspective).
@julz what's your take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my instinct is let's spin out an issue for retrying, maybe another for more generally thinking through what should happen here (e.g. should we fail readiness/liveness probes maybe so that k8s routes around us/kills us and brings up another container?), and then for right now land this with a panic (since it's an experimental branch throwing our hands in the air is probably better than logging, and recovering needs more thought - but I would like an issue to exist for at least retrying).
@markusthoemmes any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I've reverted to the panic, but kept the log message as I think it's still useful info to have)
pkg/queue/concurrency_state.go
Outdated
// the respective local function(s). The local functions are the expected behavior; the | ||
// function parameters are enabled primarily for testing purposes. | ||
func ConcurrencyStateHandler(endpoint string, h http.Handler, logger *zap.SugaredLogger, pause, resume func()) http.HandlerFunc { | ||
logger.Info("Concurrency state tracking enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this duplicates the log message on line 302 of main.go, I don't think we really need to print this at Info level twice.
5439e6f
to
0f574bb
Compare
pkg/queue/concurrency_state.go
Outdated
// Resume sends to an endpoint when request counts increase from zero | ||
func Resume(endpoint string) error { | ||
action := ConcurrencyStateMessageBody{Action: "resume"} | ||
bodyText := []byte(`{"action": "` + action.Action + `"}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we're building this with string concatenation when we've created a struct with json tags on the line above? Should this be json.Marshal(action)
instead?
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests dropped to zero") | ||
err := pause(endpoint) | ||
if err != nil { | ||
logger.Errorf("Issue handling pause request: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not convinced this is better than the panic tbh. We need to think about how we actually want to behave here. Should we retry the request? Should we exit the entire server? Panic may be all we can do, but I think we should decide explicitly if that's the case. This might be something to talk through on Wednesday / in an ad-hoc meeting actually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I sketched out some thoughts here. This was a start along those lines, but didn't want to go too far before we discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops missed that comment, let me go read now!
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests increased from zero") | ||
err := resume(endpoint) | ||
if err != nil { | ||
logger.Errorf("Issue handling resume request: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto here, if we fail to resume and then send a request to the paused pod anyway that's not going to end well obviously. A panic is better than that, but we should be clear if we want to try retrying instead, or maybe the pod should fail a readiness/liveness check or something so k8s will replace it? (Some of this might well be follow-on PRs, just thinking out loud and making sure we don't lose this thread because I think it's important - and just logging definitely seems wrong I think).
5a9fc4b
to
1919812
Compare
pkg/queue/concurrency_state.go
Outdated
// runs the `resume` function. If either of `pause` or `resume` are not passed, it runs | ||
// the respective local function(s). The local functions are the expected behavior; the | ||
// function parameters are enabled primarily for testing purposes. | ||
func ConcurrencyStateHandler(h http.Handler, logger *zap.SugaredLogger, pause, resume func(string) error, endpoint string) http.HandlerFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason it is a public function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be public to be accessible to the buildServer
function.
pkg/queue/concurrency_state.go
Outdated
// the respective local function(s). The local functions are the expected behavior; the | ||
// function parameters are enabled primarily for testing purposes. | ||
func ConcurrencyStateHandler(h http.Handler, logger *zap.SugaredLogger, pause, resume func(string) error, endpoint string) http.HandlerFunc { | ||
logger.Info("Concurrency state endpoint set, tracking request counts, using endpoint ", endpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger.Info("Concurrency state endpoint set, tracking request counts, using endpoint ", endpoint) | |
logger.Info("Concurrency state endpoint set, tracking request counts, using endpoint: ", endpoint) |
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests dropped to zero") | ||
err := pause(endpoint) | ||
if err != nil { | ||
logger.Errorf("Issue handling pause request: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger.Errorf("Issue handling pause request: %s", err) | |
logger.Errorf("Error handling pause request: %v", err) |
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests increased from zero") | ||
err := resume(endpoint) | ||
if err != nil { | ||
logger.Errorf("Issue handling resume request: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger.Errorf("Issue handling resume request: %s", err) | |
logger.Errorf("Error handling resume request: %v", err) |
timeout := time.Duration(env.RevisionTimeoutSeconds) * time.Second | ||
|
||
// Create queue handler chain. | ||
// Note: innermost handlers are specified first, ie. the last handler in the chain will be executed first. | ||
var composedHandler http.Handler = httpProxy | ||
if concurrencyStateEnabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the config string don't need to be "freeze-proxy"? according to: https://github.com/knative/serving/pull/11699/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the config string should be the service endpoint of the freeze service and its port. For example, assuming there was a freeze-service
running in the knative-serving
namespace using port 9696
, the config value could be: freeze-service.knative-serving.svc.cluster.local:9696
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
f274ce8
to
b5360ec
Compare
b5360ec
to
facfb67
Compare
@julz @markusthoemmes @vagababov think this one is ready to go now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments on getting this a bit more DRY and maybe layered slightly differently.
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests dropped to zero") | ||
if err := pause(endpoint); err != nil { | ||
logger.Errorf("Error handling pause request: %v", err) | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this panic crash the entire pod?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on some local testing, I believe it only kills the queue-proxy container. The user-container container doesn't seem to be impacted. Here's some kubectl output when a panic occurs:
$ k get po -w
NAME READY STATUS RESTARTS AGE
example-00001-deployment-598f4f7b76-7rngf 2/2 Running 0 107s
example-00001-deployment-598f4f7b76-7rngf 1/2 Error 0 111s
example-00001-deployment-598f4f7b76-7rngf 1/2 Running 1 112s
example-00001-deployment-598f4f7b76-7rngf 1/2 Running 1 113s
example-00001-deployment-598f4f7b76-7rngf 2/2 Running 1 114s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's fine. I guess we'll want the entire pod to be recreated if that happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/queue/concurrency_state.go
Outdated
action := ConcurrencyStateMessageBody{Action: "pause"} | ||
bodyText, err := json.Marshal(action) | ||
if err != nil { | ||
return fmt.Errorf("unable to create request body: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is static content anyway, can we cache it? Seems wasteful to marshal this on each request (same below)
pkg/queue/concurrency_state.go
Outdated
func Resume(endpoint string) error { | ||
action := ConcurrencyStateMessageBody{Action: "resume"} | ||
bodyText, err := json.Marshal(action) | ||
if err != nil { | ||
return fmt.Errorf("unable to create request body: %w", err) | ||
} | ||
body := bytes.NewBuffer(bodyText) | ||
req, err := http.NewRequest(http.MethodPost, endpoint, body) | ||
if err != nil { | ||
return fmt.Errorf("unable to create request: %w", err) | ||
} | ||
req.Header.Add("Token", "nil") // TODO: use serviceaccountToken from projected volume | ||
resp, err := http.DefaultClient.Do(req) | ||
if err != nil { | ||
return fmt.Errorf("unable to post request: %w", err) | ||
} | ||
if resp.StatusCode != http.StatusOK { | ||
return fmt.Errorf("expected 200 response, got: %d: %s", resp.StatusCode, resp.Status) | ||
} | ||
|
||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the exact same function as above modulo Action: "resume"
. Shall we deduplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense. I recall discussing in one of the other PRs keeping the pause/resume functions separate for testing purposes, but no need for identical functions. I can dedupe to a single function (say func concurrencyStateRequest(b ConcurrencyStateMessageBody) error {}
(removing endpoint as per comment below).
One question -- would it still be worthwhile to keep distinct pause/resume functions (where each would just call the concurrencyStateRequest
function with the relevant parameter? I could see it help with readability, but I may be overthinking things a bit too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's fine to have both as readability wrappers IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I could also see the form ultimately being:
In main.go
resume := concurrencyStateRequest(endpoint, ConcurrencyStateMessageBody{Action: "resume"})
pause := concurrencyStateRequest(endpoint, ConcurrencyStateMessageBody{Action: "pause"})
ConcurrencyStateHandler(logger, handler, resume, pause)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I misread the comment below... yes, passing the endpoint directly to the resume/pause functions (and skipping the Handler function entirely) makes total sense. Thanks for the clarification!
pkg/queue/concurrency_state.go
Outdated
@@ -85,3 +88,55 @@ func ConcurrencyStateHandler(logger *zap.SugaredLogger, h http.Handler, pause, r | |||
<-done | |||
} | |||
} | |||
|
|||
// Pause sends to an endpoint when request counts drop to zero | |||
func Pause(endpoint string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The layering with passing endpoint
in here seems a little odd. It feels like it should be captured in the functions passed to ConcurrencyStateHandler
so it itself wouldn't care what to pass to these functions.
pkg/queue/concurrency_state.go
Outdated
bodyText, err := json.Marshal(action) | ||
if err != nil { | ||
return fmt.Errorf("unable to create request body: %w", err) | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% happy with a panic here, so if there's a better way to go I'm all ears
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do this, find a approach to relaunch a pod
835feff
to
c4197f3
Compare
c4197f3
to
d1932d2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/assign
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests increased from zero") | ||
if err := resume(); err != nil { | ||
logger.Errorf("Error handling resume request: %v", err) | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This panic won't actually do anything anymore as it's now situated directly in ServeHTTP
where panics are usually caught by the wrapping server. We might need an explicit os.Exit(1)
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think you +1ed this comment but forgot to do it @psschwei :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops, looks like I did it on the one above, but not here 🤦
pkg/queue/concurrency_state.go
Outdated
paused = false | ||
logger.Info("From-Zero request successfully processed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug?
pkg/queue/concurrency_state.go
Outdated
paused = true | ||
logger.Info("To-Zero request successfully processed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug?
cmd/queue/main.go
Outdated
pauseMsg, err := queue.CreateConcurrencyStateMessageBody(queue.ConcurrencyStateMessageBody{Action: "pause"}) | ||
if err != nil { | ||
logger.Errorf("unable to create concurrency state pause message body: %s", err) | ||
} | ||
resumeMsg, err := queue.CreateConcurrencyStateMessageBody(queue.ConcurrencyStateMessageBody{Action: "resume"}) | ||
if err != nil { | ||
logger.Errorf("unable to create concurrency state resume message body: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems quite redundant. Why not pass a ConcurrencyStateMessageBody
to ConcurrencyStateRequest
and marshal in the function before returning the closure? Seems better API design too, since ConcurrencyStateRequest
taking []byte
is a bit more generic than seems healthy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally was passing a ConcurrencyStateMessageBody
(though without a separate function for marshalling), and for some reason then the resume
call would only work once (on any subsequent call, it was throwing a 400
error). Now, however, using CreateConcurrencyStateMessageBody
within ConcurrencyStateRequest
works fine. 🤷 Updating now.
d1932d2
to
2593a40
Compare
pkg/queue/concurrency_state.go
Outdated
func CreateConcurrencyStateMessageBody(action ConcurrencyStateMessageBody) ([]byte, error) { | ||
return json.Marshal(action) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can drop this, I think.
pkg/queue/concurrency_state.go
Outdated
_ = fmt.Errorf("unable to create message body: %w", err) | ||
os.Exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this does what you think it does. It won't log the message for instance.
I think we should surface the error through the return and logger.Fatal
where this function is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a silly question, but given that this function returns a edit: yes, it was a silly question 🤦func() error
, how would we return just the error here?
Failing tests were from /test pull-knative-serving-unit-tests |
pkg/queue/concurrency_state.go
Outdated
logger.Info("Requests increased from zero") | ||
if err := resume(); err != nil { | ||
logger.Errorf("Error handling resume request: %v", err) | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think you +1ed this comment but forgot to do it @psschwei :)
pkg/queue/concurrency_state.go
Outdated
mux.Unlock() | ||
|
||
h.ServeHTTP(w, r) | ||
} | ||
} | ||
|
||
// ConcurrencyState Request sends to the concurrency state endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// ConcurrencyState Request sends to the concurrency state endpoint | |
// ConcurrencyStateRequest sends a request to the concurrency state endpoint. |
pkg/queue/concurrency_state.go
Outdated
}, nil | ||
} | ||
|
||
type ConcurrencyStateMessageBody struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total nit, but I don't think the Body
suffix is useful here. There's no metadata section or anything going on, so just ConcurrencyStateMessage
seems fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the actually see below for a suggestion involving just dropping this struct anyway
pkg/queue/concurrency_state.go
Outdated
mux.Unlock() | ||
|
||
h.ServeHTTP(w, r) | ||
} | ||
} | ||
|
||
// ConcurrencyState Request sends to the concurrency state endpoint | ||
func ConcurrencyStateRequest(endpoint string, action ConcurrencyStateMessageBody) (func() error, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at this, I wonder if it's really worth exporting a struct for this parameter, or if we could just accept a string with the action to send? That would also let us drop the the json marshalling (and therefore the error return, and the struct).
if we did this, then we could also maybe do something like:
func Pause(endpoint string) func() error {
return concurrencyStateRequest(endpoint, "pause")
}
func Resume(endpoint string) func() error {
return concurrencyStateRequest(endpoint, "resume")
}
func concurrencyStateRequest(endpoint, action string) (func() error, error) {
body := fmt.Sprintf(`{ "action": %q }`, action)
...
If we do want to have a struct, I think I'd put the methods on it rather than passing it in, which would let us drop the closures:
type ConcurrencyEndpoint struct { endpoint string }
func (c ConcurrencyEndpoint) Pause() error {
...
}
func (c ConcurrencyEndpoint) Resume() error {
...
}
endpoint := queue.ConcurrencyEndpoint(endpoint: address}
composedHandler = queue.ConcurrencyStateHandler(logger, composedHandler, endpoint.Pause, endpoint.Resume))
(that could then easily become a PauseResumer interface, but maybe in a future PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll probably want to send other info along with the freeze/pause message (pod name, for one; if for nothing else than debugging, it'd be good to know which pod is being paused and which is being resumed. Was thinking of that as a separate PR, since this is already quite large). For those reasons, I'd lean towards keeping the struct.
Putting the methods on the struct makes sense.
edit: Although as I'm thinking through it more, we should be able to add multiple parameters in the string too, so maybe the simpler, string-based approach is the better one.
pkg/queue/concurrency_state.go
Outdated
mux.Unlock() | ||
|
||
h.ServeHTTP(w, r) | ||
} | ||
} | ||
|
||
// ConcurrencyState Request sends a request to the concurrency state endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// ConcurrencyState Request sends a request to the concurrency state endpoint. | |
// concurrencyStateRequest sends a request to the concurrency state endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: julz, psschwei The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Are there documentation/examples on how to use this feature? |
For now, the setup process is documented in the README for the container-freezer sandbox repo. We're still in the process of getting documentation into the official docs. There's also a tracking doc that gives a little more detail about what the freezer is doing and why. Also, one thing to note is that the freezing capability is still very much a work in progress at the moment (for example, the code that actually does the freezing hasn't merged to the main branch just yet). |
Fixes #11832
Proposed Changes
This is part of the work to add the Pod Freezer capability to knative (see #11694 for more details)
Note: there is a skeleton of a freeze/thaw service available here, which is a refactoring of julz's freeze-proxy daemon into a standalone service.
Release Note