@jgreat jgreat commented Apr 20, 2022

Motivation

Breaking this work up into multiple parts to hopefully make this easier to review.

  • Part 1: Mobilecoind Python tests (should match what's in master now), Docker and helper scripts.
  • Part 2: Helm charts
  • Part 3: GitHub Actions and test wrappers.

The main goal for refactoring the release workflow was to enable block_version 0 -> block_version 1 testing.

  1. Build Rust and Go binaries.
  2. Build/Publish Docker images.
  3. Build/Publish Helm Charts.
  4. Deploy "previous" release (v1.1.3)
  5. Run integration tests against current release.
  6. Upgrade to current release block_version=0.
  7. Run integration tests.
  8. Upgrade consensus to block_version=1.
  9. Run integration tests.

CI/CD "improvements"

  • Updated OS and utilities for build and runtime.
  • Slimmed down build and runtime images.
  • Using the new build image provides a static, versioned, verifiable and repeatable build process.
  • Pre-install of Rust/cargo targets.
  • Versioned helm charts published to S3 repo for each release tied to the versioned docker images.
  • More consistent runtime environment setup through helm configuration sub-charts.
  • GitHub Actions with private runners.
  • Dynamically built dev environments for feature/* branches.
  • Ability to retry failed steps.
  • Skip build/ci with head commit messages.
  • Build -dev release on release/*
  • reduce chance of "shared" secret leaks, generate unique dev env secrets and keys.
  • Manual ad-hoc dev environment actions: deploy, reset, delete, test

In this PR

Refactor of helm charts used for consensus and fog deployments.

.internal-ci/helm

One of the goals was to abstract shared and local configuration values from the runtime of the charts and do "Zero Config" deployments and upgrades of the applications. My intention is to allow environment-specific configuration (config and secrets for TestNet vs MainNet...) to be defined and set outside the application deployments.

Assuming configuration is relatively static, this should minimize the human error factor when we deploy Consensus and Fog services. Of course the real world may have other ideas 🤷

To achieve this, the helm charts are now split into "config" charts and application charts. Although most of the time we would run config separately from the app charts, the app charts set the appropriate config charts as optional dependencies for a more convenient install outside our systems.

A secondary goal was having a repository of static, versioned charts (configuration and deployment instructions) that is tested with, and directly correlates with, a specific build. This should eliminate the pain point of build/deployment config drift from the time we deployed the core apps. The trade-off is that updates to the deployment and/or tooling now have to follow the application lifecycle, i.e. at least a point patch, with follow-up to future release and edge branches.

Charts and descriptions

  • mc-core-dev-env-setup - Chart that takes all the various config charts and bundles them into a convenient single step deployment for our dev environments.
  • mc-core-common-config - common config elements needed for any of the core apps. client auth, ias, network for monitoring, mobilecoind configs.
  • consensus-node[-config] - Deploy a single consensus node.
  • fog-ingest[-config] - Deploy a blue or green fog-ingest.
  • fog-services[-config] - Deploy fog services (report, view, ledger).
  • mobilecoind - Deploy a standalone mobilecoind with api endpoint enabled.
  • watcher - Deploy a copy of the AVR Watcher service.
  • fog-test-client - Deploy fog-test-client in canary mode.

Future Work

  • manual dispatch workflow to build artifacts for TestNet and MainNet deployments.
  • refactor of entrypoint scripts to align internal deployment configuration with partner deployments.
  • generate fog-report signing keys, instead of using shared key.

@jgreat jgreat requested review from a team, MCrank and joekottke April 20, 2022 16:56
@wjuan-mob wjuan-mob self-requested a review April 21, 2022 17:47

@joekottke joekottke left a comment

Having talked with @jgreat often through the process of this, as well as going through the walk-through, I'm giving an LGTM. Will continue to consume the chart code, but didn't want to hold up the review process.

@MCrank MCrank left a comment

LGTM - This was a ton of work and is somewhat of a Modern Marvel

@jgreat jgreat merged commit d25894a into mobilecoinfoundation:release-1.2.0 Apr 22, 2022

`fog-ingest` is only designed to have one active instance. We should run at-least 2 in order to have a hot standby incase the active instance fails. Scaling the replicas doesn't improve performance.

The peer list generation happens when the chart is generated. In order to scale the fog-ingest service you should adjust the `fogIngest.replicaCount` value and upgrade the fogIngest. The peer list is added to the ConfigMap additional pods will be added, but existing pods will not automatically update. Either destroy and re-create the pods or execute a restart of the fog services with supervisord.

Suggested change
The peer list generation happens when the chart is generated. In order to scale the fog-ingest service you should adjust the `fogIngest.replicaCount` value and upgrade the fogIngest. The peer list is added to the ConfigMap additional pods will be added, but existing pods will not automatically update. Either destroy and re-create the pods or execute a restart of the fog services with supervisord.
The peer list generation happens when the chart is generated. In order to scale the fog-ingest service you should adjust the `fogIngest.replicaCount` value and upgrade the fogIngest. When the peer list is added to the ConfigMap additional pods will be added, but existing pods will not automatically update. Either destroy and re-create the pods or execute a restart of the fog services with supervisord.
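The behavior described in this paragraph can be modeled with a small sketch. This is an illustration only, not the chart's actual logic: the pod naming scheme (`fog-ingest-0`, `fog-ingest-1`, ...), the `render_peer_list` helper, and the `Pod` class are hypothetical, and the sketch assumes each pod reads the peer list once at process start.

```python
def render_peer_list(replica_count: int, name: str = "fog-ingest") -> list[str]:
    """Peer list as it might be rendered at chart-render time (hypothetical naming)."""
    return [f"{name}-{i}" for i in range(replica_count)]


class Pod:
    """Models a pod that snapshots the peer list when its process starts."""

    def __init__(self, name: str, config_map: dict):
        self.name = name
        self.peers = list(config_map["peers"])  # copy taken at start-up

    def restart(self, config_map: dict) -> None:
        # Re-creating the pod or restarting the service re-reads the ConfigMap.
        self.peers = list(config_map["peers"])


# Initial install with replicaCount = 2.
config_map = {"peers": render_peer_list(2)}
pods = [Pod(name, config_map) for name in config_map["peers"]]

# Upgrade with replicaCount = 3: the ConfigMap is re-rendered and one pod is added.
config_map["peers"] = render_peer_list(3)
pods.append(Pod("fog-ingest-2", config_map))

# Existing pods still hold the old two-entry list.
stale = [p.name for p in pods if p.peers != config_map["peers"]]
print(stale)  # ['fog-ingest-0', 'fog-ingest-1']

# Restarting every pod brings all of them up to date.
for p in pods:
    p.restart(config_map)
```

In this model only the newly added pod sees the full peer list until the older pods are restarted, which is why the text recommends re-creating the pods or restarting the services after changing `fogIngest.replicaCount`.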


- `supervisord-mobilecoind`

`mobilecoind` configuration for in container supervisord. Example values are for MobileCoin MainNet.

Suggested change
`mobilecoind` configuration for in container supervisord. Example values are for MobileCoin MainNet.
`mobilecoind` configuration in container supervisord. Example values are for MobileCoin MainNet.


- `fog-ingest` ConfigMap

Database connection configuration for fog-ingest

I know you noted this in the file itself, but is it worth also noting that: "For helm deployed postgres, set configMap.enabled and secret.enabled true" here?

{{- $salt }}
{{- end }}

{{/* fogViewHTTPCookieSalt - reuse existing password */}}

Could you add a bit more comment on how these salt functions are supposed to work?
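The template comment above only says "reuse existing password", so the exact mechanism is not visible in this thread. A common pattern in Helm charts is to look up an already-deployed Secret (Helm's `lookup` function) and fall back to a freshly generated random value (for example Sprig's `randAlphaNum`) only on first install. Whether these salt helpers do exactly that is an assumption; the Python sketch below just models the reuse-or-generate idea, and `reuse_or_generate` and its parameters are hypothetical names.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def reuse_or_generate(existing_secret, key, length=32):
    """Return the stored value for `key` if present, else a new random string.

    `existing_secret` stands in for the data of a Secret that is already
    deployed in the cluster (None on a first install).
    """
    if existing_secret and key in existing_secret:
        return existing_secret[key]
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# First install: nothing deployed yet, so a salt is generated.
first = reuse_or_generate(None, "fogViewHTTPCookieSalt")

# Upgrade: the previously stored salt is reused, so it stays stable.
deployed = {"fogViewHTTPCookieSalt": first}
second = reuse_or_generate(deployed, "fogViewHTTPCookieSalt")
print(first == second)  # True
```

The point of the pattern is that an upgrade does not rotate the salt, so values derived from it remain valid across releases.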

--client-responder-id "%(ENV_CLIENT_RESPONDER_ID)s"
{{- if (include "fogServices.clientAuth" .) }}
--client-auth-token-secret "%(ENV_CLIENT_AUTH_TOKEN_SECRET)s"
--client-auth-token-max-lifetime 31536000

Is this number arbitrary?
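For reference, 31536000 equals the number of seconds in 365 days. Assuming the flag value is interpreted in seconds (not confirmed in this thread), a quick check:

```python
from datetime import timedelta

# Value passed to --client-auth-token-max-lifetime in the template above.
MAX_LIFETIME_SECONDS = 31536000

one_year = timedelta(days=365)
print(int(one_year.total_seconds()))                 # 31536000
print(timedelta(seconds=MAX_LIFETIME_SECONDS).days)  # 365
```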
