In our previous post, we configured our GKE Concourse CI server, which we called the capstone of the series. But we were wrong: this post is the capstone of the series. In this post, we install Vault and configure our Concourse CI server to use Vault to retrieve secrets.
Installation
Most of these instructions are derived from the Hashicorp tutorial, Vault on Kubernetes Deployment Guide.
Create a DNS A record which points to the IP address of your GKE load balancer.
In our case, we created vault.nono.io, which points to 104.155.144.4.
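If you’ve forgotten the load balancer’s IP address, something along these lines will recover it. This is a sketch that assumes an NGINX ingress controller in the ingress-nginx namespace; your namespace and service names may differ:
kubectl get service --namespace ingress-nginx
# the EXTERNAL-IP column of the LoadBalancer-type service is the address for the A record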
Create the vault namespace and deploy the TLS issuers to that namespace.
Replace brian.cunnie@gmail.com
with your email address:
kubectl create namespace vault
kubectl apply -f <(
  curl -o- https://cert-manager.io/docs/tutorials/acme/example/staging-issuer.yaml |
    sed 's/user@example.com/brian.cunnie@gmail.com/') -n vault
kubectl apply -f <(
  curl -o- https://cert-manager.io/docs/tutorials/acme/example/production-issuer.yaml |
    sed 's/user@example.com/brian.cunnie@gmail.com/') -n vault
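As a quick sanity check, make sure both issuers report Ready (the issuer names come from the cert-manager example manifests; output is illustrative):
kubectl get issuers --namespace vault
# NAME                  READY   AGE
# letsencrypt-staging   True    1m
# letsencrypt-prod      True    1m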
Let’s create vault-values.yml, which contains the necessary customizations for our Vault server. Replace vault.nono.io with your DNS record:
injector:
  enabled: false
server:
  ingress:
    enabled: true
    labels:
      traffic: external
    annotations:
      cert-manager.io/issuer: "letsencrypt-prod"
      kubernetes.io/ingress.class: "nginx"
    hosts:
      - host: vault.nono.io
        paths: []
    tls:
      - hosts:
          - vault.nono.io
        secretName: vault.nono.io
ui:
  enabled: true
Let’s use Helm to install Vault:
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --namespace vault \
  -f vault-values.yml \
  --wait
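At this point the Vault pod should be running but not yet ready; the readiness probe won’t pass until we initialize and unseal the Vault. A quick check (output illustrative):
kubectl get pods --namespace vault
# NAME      READY   STATUS    RESTARTS   AGE
# vault-0   0/1     Running   0          90s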
Let’s initialize our pristine Vault server. We want only one key ("secret_shares": 1) to unseal the vault; we’re not a nuclear missile silo that needs three separate keys to trigger a launch.
curl \
  --request POST \
  --data '{"secret_shares": 1, "secret_threshold": 1}' \
  https://vault.nono.io/v1/sys/init | jq
Record root_token and keys; you’ll need them to log in and unseal the Vault, respectively.
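The init endpoint returns JSON along these lines (the values below are illustrative, not real credentials):
{
  "keys": ["5a302397..."],
  "keys_base64": ["WjAyMzk3..."],
  "root_token": "s.QmBy..."
}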
Replace VAULT_KEY and VAULT_TOKEN with your keys and root_token:
export VAULT_KEY=5a302397xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export VAULT_TOKEN=s.QmByxxxxxxxxxxxxxxxxxxxx
export VAULT_ADDR=https://vault.nono.io
curl \
  --request POST \
  --data '{"key": "'$VAULT_KEY'"}' \
  $VAULT_ADDR/v1/sys/unseal | jq
# check initialization status
curl $VAULT_ADDR/v1/sys/init
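To confirm the unseal worked, check the seal status; look for "sealed": false in the output:
curl $VAULT_ADDR/v1/sys/seal-status | jq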
Concourse Integration
Much of this section was shamelessly copied from the excellent canonical Concourse documentation, The Vault credential manager.
Configure the Secrets Engine
Create a key-value concourse/ path in Vault for Concourse to access its secrets:
vault secrets enable -version=1 -path=concourse kv
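Note that the vault CLI picks up the VAULT_ADDR and VAULT_TOKEN environment variables we exported earlier, so no separate vault login is needed. To confirm the mount exists (output trimmed and illustrative):
vault secrets list
# Path          Type    Description
# concourse/    kv      n/a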
Create concourse-policy.hcl so that our Concourse server has access to that path:
path "concourse/*" {
policy = "read"
}
Let’s upload that policy to Vault:
vault policy write concourse concourse-policy.hcl
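To double-check that the policy landed intact:
vault policy read concourse
# path "concourse/*" {
#   policy = "read"
# }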
Configure a Vault approle for Concourse
Let’s enable the approle backend on Vault:
vault auth enable approle
Let’s create the Concourse approle:
vault write auth/approle/role/concourse policies=concourse period=1h
We need the approle’s role_id and secret_id to configure our Concourse server:
vault read auth/approle/role/concourse/role-id
# role_id 045e3a37-6cc4-4f6b-4312-36eed80f7adc
vault write -f auth/approle/role/concourse/secret-id
# secret_id 59b8015d-8d4a-fcce-f689-xxxxxxxxxxxx
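Before wiring these into Concourse, you can sanity-check the approle credentials by logging in with them directly; a successful login returns a client token (this sketch reuses the IDs shown above):
vault write auth/approle/login \
  role_id=045e3a37-6cc4-4f6b-4312-36eed80f7adc \
  secret_id=59b8015d-8d4a-fcce-f689-xxxxxxxxxxxx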
Configure Concourse to Use Vault
Let’s configure our Concourse deployment to use Vault. We append yet more arguments to our already-gargantuan command [values_file] to deploy Concourse. Replace gke-nono-io with the name of your Helm release, gke.nono.io with the hostname of your Concourse server, and replace all the various and sundry credentials, too:
helm upgrade gke-nono-io concourse/concourse \
  --set concourse.web.externalUrl=https://gke.nono.io \
  --set concourse.web.auth.duration=240h \
  --set 'web.ingress.enabled=true' \
  --set 'web.ingress.annotations.cert-manager\.io/issuer=letsencrypt-prod' \
  --set 'web.ingress.annotations.kubernetes\.io/ingress.class=nginx' \
  --set 'web.ingress.hosts={gke.nono.io}' \
  --set 'web.ingress.tls[0].hosts[0]=gke.nono.io' \
  --set 'web.ingress.tls[0].secretName=gke.nono.io' \
  \
  --set-file secrets.sessionSigningKey=secrets/session_signing_key \
  --set-file secrets.hostKey=secrets/tsa_host_key \
  --set-file secrets.hostKeyPub=secrets/tsa_host_key.pub \
  --set-file secrets.workerKey=secrets/worker_key \
  --set-file secrets.workerKeyPub=secrets/worker_key.pub \
  \
  --set secrets.localUsers="" \
  --set concourse.web.localAuth.enabled=false \
  --set concourse.web.auth.mainTeam.github.org=blabbertabber \
  --set concourse.web.auth.github.enabled=true \
  --set secrets.githubClientId=5e4ffee9dfdced62ebe3 \
  --set secrets.githubClientSecret=549e10b1680ead9cafa30d4c9a715681cec9b074 \
  \
  --set concourse.web.vault.enabled=true \
  --set concourse.web.vault.url=https://vault.nono.io:443 \
  --set concourse.web.vault.authBackend=approle \
  --set concourse.web.vault.useAuthParam=true \
  --set secrets.vaultAuthParam="role_id:045e3a37-6cc4-4f6b-4312-36eed80f7adc\\,secret_id:59b8015d-8d4a-fcce-f689-xxxxxxxxxxxx" \
  \
  --wait
Gotchas: you need :443 at the end of the Vault URL [443], and you need the double-backslash before the comma in the vaultAuthParam [double_backslash].
Putting It Together
Let’s create a secret which we’ll interpolate into our pipeline. We create the key under the Vault path concourse/main (main is our Concourse team’s name; if you’re not sure what your Concourse team’s name is, it’s probably main):
vault kv put concourse/main/ozymandias-secret value="Look on my Works, ye Mighty, and despair\!"
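To make sure the secret landed where Concourse will look for it, read it back (output illustrative):
vault kv get concourse/main/ozymandias-secret
# ====== Data ======
# Key      Value
# ---      -----
# value    Look on my Works, ye Mighty, and despair!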
Let’s create a simple Concourse pipeline definition, pipeline-ozymandias.yml:
jobs:
- name: ozymandias-job
  plan:
  - task: ozymandias-task
    config:
      platform: linux
      image_resource:
        type: docker-image
        source:
          repository: fedora
      run:
        path: echo
        args:
        - "Ozymandias says:"
        - ((ozymandias-secret))
Let’s fly our new pipeline. Replace gke with your Concourse target’s name:
fly -t gke set-pipeline -p ozymandias-pipeline -c pipeline-ozymandias.yml
fly -t gke expose-pipeline -p ozymandias-pipeline
fly -t gke unpause-pipeline -p ozymandias-pipeline
fly -t gke trigger-job -j ozymandias-pipeline/ozymandias-job
Let’s browse to our job, expand ozymandias-task by clicking on it, and allow ourselves to luxuriate in the sweet smell of success.
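Alternatively, you can watch the job’s output without leaving the terminal (output illustrative):
fly -t gke watch -j ozymandias-pipeline/ozymandias-job
# Ozymandias says: Look on my Works, ye Mighty, and despair!
# succeeded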
If you instead see an aborted Concourse job with the error “failed to interpolate task config: undefined vars: ozymandias-secret”, you probably mangled the authentication credentials (vaultAuthParam) setting.
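If the error persists, it can help to confirm that the Concourse web node actually picked up the Vault settings. A sketch, assuming the web deployment is named gke-nono-io-web and lives in the default namespace (adjust for your release):
kubectl describe deployment gke-nono-io-web | grep -i vault
kubectl logs deployment/gke-nono-io-web | grep -i vault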
What We Did Wrong
Hashicorp warns:
Vault should not be deployed in a public internet facing environment
We’re doing the exact opposite of what they suggest. If we’re going to the trouble of deploying Vault, we want to make sure we can use it from everywhere, security be damned; we don’t want to waste our time sprinkling separate Vault deployments like magic fairy dust on each of our environments.
Hashicorp also warns:
For a production-ready install, we suggest that the Vault Helm chart is installed in high availability (HA) mode
We’re installing in standalone mode, not HA mode. We think HA mode is overkill for our use case. Also, our GKE cluster is small because we pay for it out-of-pocket, and we don’t have the resources to spend on HA mode.
Addendum: Updating
Here’s the correct process for updating (a sketch of the commands follows the list):
- Update Vault. Then unseal the Vault.
- Update Concourse.
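Here’s a sketch of that sequence, reusing the release names, values files, and environment variables from earlier in this post and its footnotes:
helm repo update
helm upgrade vault hashicorp/vault --namespace vault -f vault-values.yml
# no --wait above: the Vault pod won't report ready until it has been unsealed again;
# give the new pod a moment to start before unsealing
curl --request POST --data '{"key": "'$VAULT_KEY'"}' $VAULT_ADDR/v1/sys/unseal | jq
helm upgrade gke-nono-io concourse/concourse -f concourse-values.yml --wait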
Addendum: Motivation
“Why go to all this trouble to install Vault? Wasn’t the Concourse server enough?” you might ask.
The reason we installed Vault was that there were two things we wanted to accomplish that assumed Vault (or another secret-store such as CredHub):
We wanted a Concourse pipeline “to build and publish a container image”. The tutorial’s pipelines assumed a secret-store. We didn’t have one. We didn’t want to be losers that didn’t have a secret-store; we wanted to be one of the cool kids.
We wanted a Concourse pipeline that used Platform Automation to deploy a Tanzu Ops Manager. We’re part of the development team that supports Ops Manager, and we like having our own testbed (instead of the corporate ones) to develop/troubleshoot. Note: there’s nothing wrong with the corporate testbeds—they’re pretty awesome—but we like having our own personal testbed, as well as our own personal Concourse CI server.
“Why use Vault instead of CredHub? Isn’t CredHub more tightly integrated into the Cloud Foundry ecosystem?” you might also ask. The answer is simple: we like a GUI. Yes, we know that using the GUI is not as macho as using the CLI, but so what? A well-designed GUI can present data in a more digestible fashion than a CLI.
Also, we wanted to learn how to use Vault, and this was a good opportunity. Besides, Vault was written in Golang, which is more fashionable than CredHub’s Java (and starts more quickly, too).
Updates/Errata
2022-07-04 Added the order in which to apply Vault & Concourse updates.
Footnotes
Are you sick of the gargantuan command to deploy Concourse? Then do what we do: use a Helm values file (e.g. concourse-values.yml). You can see ours here.
With that file, our command to deploy Concourse becomes manageably smaller:
helm upgrade gke-nono-io concourse/concourse \
  -f concourse-values.yml \
  ...
You may think, “That :443 at the end of https://vault.nono.io:443 is redundant & superfluous; I’ll redact that in my version. Any programmer worth his salt knows that the https scheme defaults to port 443.” Well, I got news for you, Sunshine: you need that :443; if you skip it, you’ll get the dreaded “failed to interpolate task config: timed out to login to vault” error when your Concourse task attempts to interpolate a variable.
In some ways it’s similar to the openssl s_client -connect vault.nono.io:443 command, which insists that you specify the port even though 99% of the time that port is going to be 443. What, you’re not familiar with the openssl s_client -connect command? Well, stick around; it’ll come in useful when you have to debug certs on someone else’s server.
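For example, to eyeball the certificate a remote server presents (its subject, issuer, and validity dates):
openssl s_client -connect vault.nono.io:443 -servername vault.nono.io </dev/null 2>/dev/null |
  openssl x509 -noout -subject -issuer -dates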
You need the double backslash before the comma; if you don’t have it, the following error will rear its ugly head when you attempt helm upgrade:
Error: INSTALLATION FAILED: failed parsing --set data: key "secret_id:59b8015d-8d4a-fcce-f689-xxxxxxxxxxxx" has no value