Canary Endpoints:
const (
	BlueService          SelectorName = "blue"
	GreenService         SelectorName = "green"
	InitialCanaryService SelectorName = "blue"
)
Upgrading a Canary Service/Deployment
To upgrade a canary service (deployment), specify the hash of the image to deploy. The image is applied to the active canary service. See below for examples.
Update active canary service:
POST /services { "selector": "green" }
If the active service already matches the given selector, the response contains a message saying so.
Before:
iiq-wp-platform git:(dev) kc get configmap -n wp-platform -o yaml
apiVersion: v1
items:
- apiVersion: v1
  data:
    canaryEnv: '{"selector":"blue","image":"961406424767.dkr.ecr.us-west-2.amazonaws.com/rezfusion-cloud:dev"}'
  kind: ConfigMap
  ...
After request:
POST to /services { "selector": "green" }
➜ iiq-wp-platform git:(dev) kc get configmap -n wp-platform -o yaml
apiVersion: v1
items:
- apiVersion: v1
  data:
    canaryEnv: '{"selector":"green","image":"961406424767.dkr.ecr.us-west-2.amazonaws.com/rezfusion-cloud:dev"}'
  kind: ConfigMap
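The selector switch shown above can be sketched in Go as follows. The SelectorName type and the blue/green constants come from the const block at the top of this page; selectorRequest, other, and buildSelectorBody are hypothetical names introduced for illustration, and the sketch assumes that blue and green are the only two services.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SelectorName and its constants mirror the const block at the top of this page.
type SelectorName string

const (
	BlueService  SelectorName = "blue"
	GreenService SelectorName = "green"
)

// selectorRequest is a hypothetical type modelling the documented
// request body {"selector": "..."}.
type selectorRequest struct {
	Selector SelectorName `json:"selector"`
}

// other returns the opposite service. Assumption: only blue and green exist.
func other(s SelectorName) (SelectorName, error) {
	switch s {
	case BlueService:
		return GreenService, nil
	case GreenService:
		return BlueService, nil
	default:
		return "", fmt.Errorf("unknown selector %q", s)
	}
}

// buildSelectorBody builds the JSON body that switches the active canary
// service away from the current one.
func buildSelectorBody(current SelectorName) ([]byte, error) {
	next, err := other(current)
	if err != nil {
		return nil, err
	}
	return json.Marshal(selectorRequest{Selector: next})
}

func main() {
	body, err := buildSelectorBody(BlueService)
	if err != nil {
		panic(err)
	}
	fmt.Println("POST /services", string(body))
}
```

Run as written, this prints POST /services {"selector":"green"}, matching the request in the example above.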
Then simply update the image for the active canary service:
POST /services
{
  "imageTag": "216c0204fa1e71f93603c0d5087ef16d6b2ba5bce9084874bf9b2aebcddebc77",
  // Denotes whether this is a promote or an upgrade.
  // Promote runs an additional step, PatchIngress.
  "promote": false
}
➜ iiq-wp-platform git:(dev) kc describe -n wp-platform deployment/green-deployment
Name:                   green-deployment
Namespace:              wp-platform
CreationTimestamp:      Wed, 05 Apr 2023 15:06:04 -0600
Labels:                 app=green
Annotations:            deployment.kubernetes.io/revision: 63
Selector:               app=green
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=green
  Annotations:  kubectl.kubernetes.io/restartedAt: 2023-04-11T05:00:45-06:00
  Containers:
   web:
    Image:  961406424767.dkr.ecr.us-west-2.amazonaws.com/rezfusion-cloud@sha256:1274e8bc4d963536e88265781170e72a2939caae25127cd9d400ab124393f946
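As a rough illustration of the image update request, the Go sketch below models the documented body and checks the tag before sending. imageUpdate, validate, and imageRef are hypothetical names. The 64-character hex check is an assumption based on the example hashes on this page, and the repository@sha256:digest form is taken from the Image lines in the kubectl output, not from the service code.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// imageUpdate is a hypothetical type modelling the documented body
// {"imageTag": "...", "promote": false}.
type imageUpdate struct {
	ImageTag string `json:"imageTag"`
	Promote  bool   `json:"promote"`
}

// validate checks that ImageTag looks like a SHA-256 digest (64 hex
// characters). Assumption: the service expects this format.
func (u imageUpdate) validate() error {
	raw, err := hex.DecodeString(u.ImageTag)
	if err != nil {
		return fmt.Errorf("imageTag is not valid hex: %w", err)
	}
	if len(raw) != 32 {
		return errors.New("imageTag must be 64 hex characters")
	}
	return nil
}

// imageRef builds a digest-pinned image reference in the form seen in
// the Image lines of the kubectl describe output.
func imageRef(repository, digest string) string {
	return repository + "@sha256:" + digest
}

func main() {
	u := imageUpdate{
		ImageTag: "216c0204fa1e71f93603c0d5087ef16d6b2ba5bce9084874bf9b2aebcddebc77",
		Promote:  false, // upgrade only; no PatchIngress step
	}
	if err := u.validate(); err != nil {
		panic(err)
	}
	body, err := json.Marshal(u)
	if err != nil {
		panic(err)
	}
	fmt.Println("POST /services", string(body))
	fmt.Println(imageRef("961406424767.dkr.ecr.us-west-2.amazonaws.com/rezfusion-cloud", u.ImageTag))
}
```

Validating on the client side is optional; it only catches the case where a mutable tag such as dev is passed where a digest is expected.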
Functionality & Examples
Create
Creating a site creates the bare-minimum configuration to represent a site:
POST /sites/create
{
  "id": "rrr",
  "hostnames": ["rrr.cloud2-stg.rezfusion.com"],
  "service": "blue",
  "name": "Project Bluelaunch | rrr",
  "canonicalHostname": "rrr.cloud2-stg.rezfusion.com"
}
This will create the AWS Dynamo DB entry for the site as well as relevant secrets for accessing the database and WP CMS.
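The create payload can be modelled in Go roughly as below. The field names follow the JSON example above; createSiteRequest and validate are hypothetical, and the two checks (service must be blue or green, canonicalHostname must appear in hostnames) are assumptions about sensible input, not documented server-side rules.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// createSiteRequest is a hypothetical type mirroring the documented
// POST /sites/create body.
type createSiteRequest struct {
	ID                string   `json:"id"`
	Hostnames         []string `json:"hostnames"`
	Service           string   `json:"service"`
	Name              string   `json:"name"`
	CanonicalHostname string   `json:"canonicalHostname"`
}

// validate applies assumed sanity checks before the request is sent.
func (r createSiteRequest) validate() error {
	if r.ID == "" {
		return errors.New("id is required")
	}
	if r.Service != "blue" && r.Service != "green" {
		return fmt.Errorf("unknown service %q", r.Service)
	}
	for _, h := range r.Hostnames {
		if h == r.CanonicalHostname {
			return nil
		}
	}
	return fmt.Errorf("canonicalHostname %q is not listed in hostnames", r.CanonicalHostname)
}

func main() {
	req := createSiteRequest{
		ID:                "rrr",
		Hostnames:         []string{"rrr.cloud2-stg.rezfusion.com"},
		Service:           "blue",
		Name:              "Project Bluelaunch | rrr",
		CanonicalHostname: "rrr.cloud2-stg.rezfusion.com",
	}
	if err := req.validate(); err != nil {
		panic(err)
	}
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println("POST /sites/create")
	fmt.Println(string(body))
}
```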
Fetching Site Details
Site details can be viewed by running a GET request against /sites/rrr:
GET /sites/{site-id}
Provision
After a site is created, it can be provisioned. Provisioning creates the actual site instance along with the resources it needs, such as the relevant databases and S3 bucket subdirectories. This endpoint kicks off the installation of a WP site, a job to add an Ingress entry, and a job to activate the desired theme for the site.
POST /sites/{site-id}/provision
Assuming all goes well, the site should be visible at the hostname configured during /sites/create within 5-10 minutes.
rrr-activate-theme-78-wqzk5   0/1   Completed   0   16m
rrr-install-78-zdlsg          0/1   Completed   0   16m
Upgrade
Upgrading a site moves it from its current service to the active canary service. When a site is upgraded, the following queued jobs are triggered.
Jobs Executed:
PatchService - updates the site's entry in the sites repository so that the site.Service value reflects the service the site has been moved to.
FlushCaches - flushes the caches on the WP site.
PUT /sites/{site-id}/upgrade
rrr-cache-flush-79-ptwfc 0/1 Completed 0 20s
Promote
Promoting is almost identical to upgrading: it moves a site to the active canary service. Additionally, it triggers several queued jobs on the site after the move to the new image.
Jobs executed:
FlushCaches
PatchIngress
PatchService
PUT /sites/{site-id}/promote
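The difference between the two per-site actions can be summarised in a short Go sketch. The job names and endpoint paths are taken from the lists above; action, jobsFor, extraPromoteJobs, and sitePath are hypothetical names, and the slice order simply mirrors the order in this document, which is not necessarily the execution order.

```go
package main

import (
	"fmt"
	"net/url"
)

// action is a hypothetical type naming the two per-site operations.
type action string

const (
	upgrade action = "upgrade"
	promote action = "promote"
)

// jobsFor returns the queued jobs listed in this document for an action.
// The order mirrors the document, not necessarily the execution order.
func jobsFor(a action) ([]string, error) {
	switch a {
	case upgrade:
		return []string{"PatchService", "FlushCaches"}, nil
	case promote:
		return []string{"FlushCaches", "PatchIngress", "PatchService"}, nil
	default:
		return nil, fmt.Errorf("unknown action %q", a)
	}
}

// extraPromoteJobs lists the jobs that promote runs and upgrade does not.
func extraPromoteJobs() []string {
	up, _ := jobsFor(upgrade) // errors cannot occur for the known constants
	pr, _ := jobsFor(promote)
	seen := make(map[string]bool, len(up))
	for _, j := range up {
		seen[j] = true
	}
	var extra []string
	for _, j := range pr {
		if !seen[j] {
			extra = append(extra, j)
		}
	}
	return extra
}

// sitePath builds the documented path /sites/{site-id}/{action}.
func sitePath(siteID string, a action) string {
	return "/sites/" + url.PathEscape(siteID) + "/" + string(a)
}

func main() {
	fmt.Println("PUT", sitePath("rrr", upgrade))
	fmt.Println("PUT", sitePath("rrr", promote))
	fmt.Println("promote-only jobs:", extraPromoteJobs())
}
```

The only job that promote adds over upgrade is PatchIngress, which is consistent with the promote flag description in the canary service section.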
Bulk Promotions/Upgrades
To roll an upgrade or promotion out for all sites on a given service, simply execute a PUT request against the /services endpoint.
Upgrade
PUT /services
{
  "imageTag": "f2e9a5d05ef6fe7714962afaa468d643b6c7195656ffc86b15b0435daeef91a4",
  "promote": "false"
}
Promote
PUT /services
{
  "imageTag": "f2e9a5d05ef6fe7714962afaa468d643b6c7195656ffc86b15b0435daeef91a4",
  "promote": "true"
}
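A bulk request could be built in Go along these lines. bulkRequest, bulkBody, and newBulkRequest are hypothetical names, the base URL is a placeholder, and the Content-Type header is an assumption. The promote field is sent as a string because the bulk examples above use "true" and "false" in quotes; if the endpoint actually expects a JSON boolean, change the field type to bool.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
)

// bulkRequest is a hypothetical type mirroring the bulk PUT /services body.
// Promote is a string because the examples in this document quote it.
type bulkRequest struct {
	ImageTag string `json:"imageTag"`
	Promote  string `json:"promote"`
}

// bulkBody encodes the request body for a bulk upgrade or promotion.
func bulkBody(imageTag string, promote bool) ([]byte, error) {
	return json.Marshal(bulkRequest{
		ImageTag: imageTag,
		Promote:  strconv.FormatBool(promote),
	})
}

// newBulkRequest builds, but does not send, the PUT /services request.
// baseURL is a placeholder for wherever the API is served.
func newBulkRequest(baseURL, imageTag string, promote bool) (*http.Request, error) {
	body, err := bulkBody(imageTag, promote)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/services", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json") // assumed header
	return req, nil
}

func main() {
	req, err := newBulkRequest(
		"http://localhost:8080", // hypothetical address
		"f2e9a5d05ef6fe7714962afaa468d643b6c7195656ffc86b15b0435daeef91a4",
		true,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Passing the returned request to an http.Client's Do method would send it; the sketch stops short of that so it runs without a live API.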
Reverting Bad Deployments/Rollbacks
Example of broken deployment:
Once a healthy image hash is identified, we can simply update the canary deployment (or update the image on production if that one has gone bad) and the pods are restarted with the fresh code changes. The benefit of using dynamic image hashes during deployment is that our deployments themselves don’t lose historical context every rollout.
Once a healthy version of the WP app finishes building, it is ready to be deployed.
The pods restart, and any pipeline update hooks (such as cache clearing) run automatically.
➜ iiq-wp-platform git:(dev) kc describe -n wp-platform deployment/green-deployment
Name:                   green-deployment
Namespace:              wp-platform
CreationTimestamp:      Wed, 05 Apr 2023 15:06:04 -0600
Labels:                 app=green
Annotations:            deployment.kubernetes.io/revision: 64
Selector:               app=green
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:       app=green
  Annotations:  kubectl.kubernetes.io/restartedAt: 2023-04-11T05:00:45-06:00
  Containers:
   web:
    Image:  961406424767.dkr.ecr.us-west-2.amazonaws.com/rezfusion-cloud@sha256:216c0204fa1e71f93603c0d5087ef16d6b2ba5bce9084874bf9b2aebcddebc77
Termination
Terminating a site is done by sending a DELETE request to /sites/{site-id}:
DELETE /sites/{site-id}
This deletes the relevant AWS Secrets entries, the database related to the site, and the S3 site objects (aka site files), and removes the ingress entry for the site. In other words, it removes the site and all related data.
Before:
Sent request:
After:
AWS Secrets are deleted (with a 7-day recovery period)
DB is deleted
Ingress entry removed
Directory deleted in S3
Site is gone