Secret Management and Why It’s Important
Hi! My name is Evgeny, and I work as a Lead DevOps Engineer at Exante. In this article, I will share our practical experience of setting up a highly available HashiCorp Vault with a GCS storage backend and auto unseal in Kubernetes (K8s).
Our infrastructure used to consist of thousands of virtual and physical machines hosting our legacy services. Configuration files, including plain-text secrets, were distributed across these machines, both manually and with the help of Chef.
We decided to change the company’s strategy for several reasons: to accelerate code delivery processes, ensure continuous delivery, securely store secrets, and speed up the deployment of new applications and environments.
We decided to transition our product to a cloud-native model, which required us to change our approach to development and infrastructure. This involved refactoring our legacy services, adopting a microservices architecture, deploying services in cloud-based Kubernetes (K8s), and utilizing managed resources like Redis and PostgreSQL.
In our situation, everything needed to change—from applications and infrastructure to how we distribute configs and secrets. We chose Google as our cloud provider and HashiCorp Vault for secret storage. We've since made significant progress on this journey.
Why HashiCorp Vault?
There were several reasons:
We needed a ready-to-use tool that could immediately improve our secret management without extensive changes to our application code. It also needed to integrate seamlessly with Chef.
HashiCorp Vault emerged as the most popular tool with the necessary functionality (secret storage, access management, injecting secrets into K8s pods, integration with Chef and GitLab CI).
Our DevOps engineers had practical experience in configuring HA and unseal.
There was also the future potential to integrate a Secret Injection Webhook into our K8s cluster.
Where and in What Configuration Do We Deploy HashiCorp Vault?
We deploy HashiCorp Vault in a K8s cluster on a dedicated node pool.
In terms of configuration, we use automatic unseal via GCP's Key Management Service (GCP KMS), so HashiCorp Vault pods unseal themselves after a restart. Google Cloud Storage (GCS) serves as the storage backend for HashiCorp Vault data, providing a straightforward and cost-effective way to cluster the service.
Why is automatic unseal so vital to us?
It's pretty simple: most of our services currently receive secrets as environment variables via the Vault Secret Injection Webhook. Without automatic unseal, any pod manipulation (restarts, new releases, etc.) could leave HashiCorp Vault sealed, causing service downtime and preventing environment variables from being passed to services.
What is required to set up HA with a GCS storage backend and GCP KMS auto unseal?
Service Account for HashiCorp Vault
GCS Bucket
Keyring + Cryptokey
We created these necessary entities using Terraform.
You can find more detailed information about the IAM policies for the service account in the official documentation: HashiCorp Vault GCS Storage Backend, and about the GCP KMS Keyring and Cryptokey in the official documentation: GCP KMS Integration.
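As an illustration, the Terraform for these three entities might look roughly like the sketch below. All resource names, the bucket name, and the key names are placeholders, and the IAM bindings are simplified compared to a production setup:

```hcl
# Service account that Vault will use to reach GCS and KMS
resource "google_service_account" "vault" {
  account_id   = "vault-server"
  display_name = "HashiCorp Vault"
}

# GCS bucket used as the Vault storage backend
resource "google_storage_bucket" "vault" {
  name     = "bucket-name"
  location = "EU"
}

resource "google_storage_bucket_iam_member" "vault" {
  bucket = google_storage_bucket.vault.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.vault.email}"
}

# Keyring + cryptokey for GCP KMS auto unseal
resource "google_kms_key_ring" "vault" {
  name     = "your-keyring"
  location = "global"
}

resource "google_kms_crypto_key" "vault" {
  name     = "your-cryptokey"
  key_ring = google_kms_key_ring.vault.id
}

resource "google_kms_crypto_key_iam_member" "vault" {
  crypto_key_id = google_kms_crypto_key.vault.id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:${google_service_account.vault.email}"
}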
How Do We Deploy HashiCorp Vault in K8s?
In our Kubernetes (K8s) cluster, we use a GitOps approach with the help of Flux CD. This implies that all new applications are deployed to the cluster via the Flux CD repository. In this article, we won't delve into the details of using GitOps.
We use SOPS to encrypt the initial secrets (necessary for deploying HashiCorp Vault). SOPS is an encrypted file editor that supports YAML, JSON, ENV, INI, and BINARY formats and encrypts using AWS KMS, GCP KMS, Azure Key Vault, age, and PGP. We use age, and the decryption key is stored in the K8s secrets.
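For illustration, a minimal `.sops.yaml` creation rule for age might look like this (the path regex and the recipient public key are placeholders; the `encrypted_regex` limits encryption to the `data`/`stringData` fields of K8s manifests):

```yaml
# .sops.yaml — hypothetical creation rule for Vault-related secrets
creation_rules:
  - path_regex: .*vault.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: age1examplepublickey0000000000000000000000000000000000000000
```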
We configure the HashiCorp Vault as follows:
HA Mode: 3 replicas
GCS Storage Backend
Auto unseal using GCP KMS
Steps to Deploy:
First, we create a secret named vault-gcs that contains our service account. We create this file from the template below and encrypt it with SOPS.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: vault-gcs
  namespace: vault
type: Opaque
data:
  vault_gcs_key.json: <base64-serviceaccount>
```
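Rather than base64-encoding the key file by hand, an equivalent manifest can be generated with kubectl and then encrypted with SOPS; the local file name here is a placeholder:

```shell
# Render the Secret manifest locally (no changes applied to the cluster);
# kubectl base64-encodes the file contents into .data automatically
kubectl create secret generic vault-gcs \
  --namespace vault \
  --from-file=vault_gcs_key.json=./vault_gcs_key.json \
  --dry-run=client -o yaml > vault-gcs-secret.yaml
```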
For HashiCorp Vault to function correctly in the specified configuration, we must pass the service account file into the pod and set additional environment variables. To achieve this, we populate the values.yaml file as follows:
```yaml
server:
  enabled: true
  extraEnvironmentVars:
    GOOGLE_APPLICATION_CREDENTIALS: /vault/userconfig/vault-gcs/vault_gcs_key.json
    GOOGLE_PROJECT: your-project
  extraVolumes:
    - name: vault-gcs
      path: /vault/userconfig
      type: secret
```
As we continue to populate the values.yaml file, it's crucial to properly configure the HA setup to support multiple replicas, GCS storage backend, and GCP KMS unseal. Ultimately, the values.yaml should look like this:
```yaml
server:
  enabled: true
  extraEnvironmentVars:
    GOOGLE_APPLICATION_CREDENTIALS: /vault/userconfig/vault-gcs/vault_gcs_key.json
    GOOGLE_PROJECT: your-project
  extraVolumes:
    - name: vault-gcs
      path: /vault/userconfig
      type: secret
  ha:
    enabled: true
    replicas: 3
    config: |
      ui = true

      listener "tcp" {
        address         = "[::]:8200"
        cluster_address = "[::]:8201"
      }

      storage "gcs" {
        bucket     = "bucket-name"
        ha_enabled = "true"
      }

      seal "gcpckms" {
        project    = "gcp-project-name"
        region     = "global"
        key_ring   = "your-keyring"
        crypto_key = "your-cryptokey"
      }
```
Additionally, our Helm chart includes internal ingress settings, but these are pretty specific and not essential for general understanding. We also disable the use of vault-agent-injector. These parameters can be configured separately based on the examples provided in the official Helm chart documentation.
Deployment to the cluster is managed through Flux CD and is fully automated, using the official Helm chart from HashiCorp.
After a successful deployment, we have three pods named vault-0, vault-1, and vault-2 in the vault namespace. To successfully launch the service, simply execute the following command:
```shell
kubectl exec -ti vault-0 -n vault -- vault operator init
```
After running the initialization command, you will receive a root token and five recovery keys. It is crucial to store these securely: losing them effectively locks you out of managing the HashiCorp Vault cluster.
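As a sketch, the share counts can also be set explicitly at initialization, and the unseal state can be verified afterwards (the flag values below are illustrative; with a gcpckms seal, `vault operator init` issues recovery keys rather than unseal keys):

```shell
# Initialize with explicit recovery shares and threshold
kubectl exec -ti vault-0 -n vault -- \
  vault operator init -recovery-shares=5 -recovery-threshold=3

# Confirm the pod unsealed itself via GCP KMS and joined the HA cluster
kubectl exec -ti vault-0 -n vault -- vault status
```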
We conducted performance benchmarking of HashiCorp Vault with various storage backends. We tested it with PostgreSQL, Consul, AWS S3, and GCS. In our tests, GCS was approximately ten times slower than PostgreSQL.
Since the speed of secret retrieval is not critical for us, and our priority is the convenience of auto unseal and rapid integration with the storage backend, we decided to use GCS.
Our Kubernetes cluster has been running with HashiCorp Vault in this configuration for two years, and so far, we have not encountered any issues with this setup.
Summary:
HashiCorp Vault functions as our continuously available, centralized, and secure source of truth, providing access control and various authentication methods. It is a user-friendly and configurable tool with up-to-date documentation. Vault integrates seamlessly with the Secret Injection Webhook, offering essential functionality such as:
Creating secrets: subsequent secrets are generated within the cluster and pulled directly from the storage, with no need for SOPS encryption.
Integration: connects with various tools and platforms (CI systems, Kubernetes pods, Chef, and more), making it convenient to import secrets from the storage.
This makes HashiCorp Vault a valuable component of our infrastructure. We will cover more details about these integrations in the next discussion!