
Conversation

@pietroalbini
Member

This PR adds a script to kick off a new release, invoking the lambda created in rust-lang/simpleinfra#441.

In addition to invoking the lambda, the script checks that authentication with AWS works and guides the caller through setting up the required environment. This should help the members of t-release who don't regularly interact with our AWS environment.

@pietroalbini
Member Author

Moved this to a draft as I discovered a fairly bad bug in the lambda that will require changes in this script too (if the CodeBuild job takes too long to start up, the lambda will queue several of them).

@pietroalbini marked this pull request as ready for review July 14, 2024 20:20
@pietroalbini
Member Author

Ok I fixed the issue I encountered.

The problem was that the lambda waits for the build to start, which might take more than a minute if there are too many queued CodeBuild jobs in the region (I often saw it taking between one and two minutes in the queue).

While the lambda did have a timeout of 15 minutes to account for this, the AWS CLI also has its own socket read timeout of a minute, and by default retries the API invocation three times when it encounters a socket read timeout. So, if the lambda didn't finish within a minute (that is, if CodeBuild took more than a minute to provision a builder), the AWS CLI would retry starting the lambda, resulting in more than one CodeBuild builder being started.

To address this I explicitly configured the max attempts to be one, and increased the socket read timeout to 15 minutes.
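For readers following along, here is a minimal sketch of what such a configuration could look like when the AWS CLI is driven from Python. It is not the script from this PR: `build_invocation`, the function name passed to it, and the output file are hypothetical, and the payload is left out. It only relies on the AWS CLI's `--cli-read-timeout` option and the `AWS_MAX_ATTEMPTS` environment variable.

```python
import os

# Assumption: mirror the lambda's own 15-minute timeout described above.
READ_TIMEOUT_SECONDS = 15 * 60


def build_invocation(function_name, outfile):
    """Return (argv, env) for one non-retried `aws lambda invoke` call.

    Hypothetical helper for illustration; the real script may be structured
    differently and also passes whatever payload the lambda expects.
    """
    argv = [
        "aws", "lambda", "invoke",
        "--function-name", function_name,
        # Wait up to 15 minutes for the response instead of the default minute.
        "--cli-read-timeout", str(READ_TIMEOUT_SECONDS),
        # `aws lambda invoke` writes the function's response to this file.
        outfile,
    ]
    env = dict(os.environ)
    # Total number of attempts, including the first one: 1 disables retries,
    # so a slow response can never start the lambda a second time.
    env["AWS_MAX_ATTEMPTS"] = "1"
    return argv, env
```

The pair can then be handed to something like `subprocess.run(argv, env=env, check=True)`. Setting the retry limit through the environment keeps the change scoped to this one command rather than to the caller's whole AWS profile.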

@Mark-Simulacrum
Member

Are we synchronously invoking the lambda? I guess that might be the issue - I'd expect that an async invocation doesn't have a socket timeout tied into the execution progressing?

But the fix sounds good to me.

@pietroalbini
Member Author

> Are we synchronously invoking the lambda? I guess that might be the issue - I'd expect that an async invocation doesn't have a socket timeout tied into the execution progressing?

Yes we are synchronously invoking it, as we need to retrieve its outputs.
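To illustrate the trade-off, a small sketch (the helper name `invocation_type_args` is hypothetical, not part of this PR) of how the two modes map onto the `--invocation-type` flag of `aws lambda invoke`:

```python
def invocation_type_args(need_output):
    """Return the `aws lambda invoke` flags for a sync or async invocation."""
    # RequestResponse keeps the connection open until the function returns,
    # so its output is available, but the client's read timeout applies.
    # Event only queues the invocation and returns immediately, so the
    # function's output is never sent back to the caller.
    invocation_type = "RequestResponse" if need_output else "Event"
    return ["--invocation-type", invocation_type]
```

Since the script needs the lambda's outputs, it has to stay on the `RequestResponse` side and therefore has to deal with the read timeout.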

@pietroalbini merged commit 89b1797 into main Jul 15, 2024
@pietroalbini deleted the pa-start-release branch July 15, 2024 07:10