According to a recent study by researchers at North Carolina State University, over 100,000 publicly accessible GitHub repositories contain exposed application secrets directly within their source code. From private API tokens to cryptographic keys, this study – which only scanned approximately 13% of GitHub’s public repositories – indicates that properly securing application secrets is one of the most neglected methods of information security in software today.
While the scale of exposure is surprising, it’s important to note that this problem doesn’t just affect open-source projects. Even private source code repositories can expose secrets if they’re not properly secured. Take, for example, the 2013 security breach at Buffer Inc. What started as illegal access to Buffer’s proprietary source code led to a leak of the company’s Twitter API credentials, ultimately resulting in countless customers’ Twitter accounts getting spammed.
Now, I don’t intend to throw Buffer under the bus here. Companies get hacked every day and Buffer’s response was top-notch. Their unfiltered transparency and incident communication provided a fascinating case study on the importance of secrets management as a core tenet of information security. But, it also raises the question of how best to manage secrets in a growing, scalable organization.
I’m a big fan of HashiCorp. Their approach to vendor-agnostic DevOps tooling provides excellent, portable solutions that abstract away individual cloud providers and focus on solving real problems. Their secrets management tool, Vault, is no exception.
While each individual cloud vendor has its own solution to secrets management, Vault is a provider-agnostic solution that allows you to centrally manage and enforce access to application secrets without regard for the underlying secrets engine or authentication methods.
Before we can get started with Vault, we first need to install it. Like all HashiCorp products, Vault is impressively cross-platform, with support for macOS, Windows, Linux, Solaris, and even the BSDs. Hell, you can even run it on a Raspberry Pi.
Once Vault is installed, we next need to start our server. For the purposes of this article, I’ll be working strictly with the Vault development server. However, it’s important to note that the development server is incredibly insecure and stores all data in memory – meaning that when you restart it, everything will be lost. In the words of HashiCorp themselves:
“The dev server should be used for experimentation with Vault features, such as different auth methods, secrets engines, audit devices, etc.”
To start the development server, simply run the vault server -dev command (the -dev flag indicates that we want the development server rather than a production server):
    $ vault server -dev
    ==> Vault server configuration:

                 Api Address: http://127.0.0.1:8200
                         Cgo: disabled
             Cluster Address: https://127.0.0.1:8201
                  Listener 1: tcp (addr: "127.0.0.1:8200", cluster address: "127.0.0.1:8201", max_request_duration: "1m30s", max_request_size: "33554432", tls: "disabled")
                   Log Level: info
                       Mlock: supported: false, enabled: false
                     Storage: inmem
                     Version: Vault v1.2.1

    WARNING! dev mode is enabled! In this mode, Vault runs entirely in-memory
    and starts unsealed with a single unseal key. The root token is already
    authenticated to the CLI, so you can immediately begin using Vault.

    You may need to set the following environment variable:

        $ export VAULT_ADDR='http://127.0.0.1:8200'

    The unseal key and root token are displayed below in case you want to
    seal/unseal the Vault or re-authenticate.

    Unseal Key: p8MumXfy57bh2T1FxdvZSmHhxqr7aQAByPpfE4PLujk=
    Root Token: s.aSQmpEYEi5MKelf5TDLPC6r9

    Development mode should NOT be used in production installations!

    ==> Vault server started! Log data will stream in below:
As you can see, a lot of data gets pushed onto the screen for you to play with. The first thing to note is that the development server does not run as a daemon by default (and, for the purposes of testing, shouldn’t ever be run as a daemon). Therefore, if you want to interact with the server, you should first open a second terminal window and export the provided VAULT_ADDR environment variable so that the vault command knows which server it should be communicating with.
The Unseal Key and Root Token values are also important to note. While we will touch on what to do with the Root Token in a later section, understanding Vault sealing/unsealing is critical to properly deploying Vault in a production environment.
In a production environment, a Vault server starts in a sealed state. This means that Vault knows where the data is, but it doesn’t know how to decrypt it. In the development server, the Vault is unsealed by default. If you choose to seal it, however, the Unseal Key is provided to, well, unseal it. An unsealed Vault stays in that state until it is either re-sealed or the server itself is restarted.
When first starting a production Vault server, it’s important to initialize it. This process will generate the encryption keys and the initial root token, and it can only be run against brand new Vaults without any data:
    $ vault operator init
    Unseal Key 1: 4jYbl2CBIv6SpkKj6Hos9iD32k5RfGkLzlosrrq/JgOm
    Unseal Key 2: B05G1DRtfYckFV5BbdBvXq0wkK5HFqB9g2jcDmNfTQiS
    Unseal Key 3: Arig0N9rN9ezkTRo7qTB7gsIZDaonOcc53EHo83F5chA
    Unseal Key 4: 0cZE0C/gEk3YHaKjIWxhyyfs8REhqkRW/CSXTnmTilv+
    Unseal Key 5: fYhZOseRgzxmJCmIqUdxEm9C3jB5Q27AowER9w4FC2Ck

    Initial Root Token: s.KkNJYWF5g0pomcCLEmDdOVCW

    Vault initialized with 5 key shares and a key threshold of 3. Please securely
    distribute the key shares printed above. When the Vault is re-sealed,
    restarted, or stopped, you must supply at least 3 of these keys to unseal it
    before it can start servicing requests.

    Vault does not store the generated master key. Without at least 3 keys to
    reconstruct the master key, Vault will remain permanently sealed!

    It is possible to generate new unseal keys, provided you have a quorum of
    existing unseal keys shares. See "vault operator rekey" for more information.
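The "5 key shares and a key threshold of 3" in that output comes from Shamir’s Secret Sharing, which Vault uses to split the master key. As a rough illustration of why any three shares are enough and two are not, here is a toy Python sketch of the idea. It is not Vault’s implementation: the prime field, the integer secret, and the function names split_secret and recover_secret are all assumptions made for this example.

```python
import secrets

# Assumption for this sketch: work in the integers modulo a Mersenne prime.
PRIME = 2**127 - 1


def split_secret(secret, shares=5, threshold=3):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must fit in the field")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = split_secret(123456789)
    # Any three of the five shares reconstruct the original value.
    print(recover_secret([shares[0], shares[2], shares[4]]) == 123456789)
```

Each share on its own reveals nothing useful about the secret, which is why the key shares can be handed to different people without any one of them being able to unseal the Vault alone.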
With a server running, the next thing we need to do is log into it. This can be done with the vault login command, which will ask for an authentication token. On an initial setup, you can authenticate with the Root Token (see above). However, in a production environment, the underlying authentication methods can be changed to provide more fine-grained control over who has access and why:
    $ vault login
    Token (will be hidden):
    Success! You are now authenticated. The token information displayed below
    is already stored in the token helper. You do NOT need to run "vault login"
    again. Future Vault requests will automatically use this token.

    Key                  Value
    ---                  -----
    token                s.aSQmpEYEi5MKelf5TDLPC6r9
    token_accessor       MaJhao2R54EdV9fDq7sL11d4
    token_duration       ∞
    token_renewable      false
    token_policies       ["root"]
    identity_policies    []
    policies             ["root"]
While HashiCorp’s Vault can be used to securely store just about any kind of data, the most common use case for Vault is as a key-value store for application secrets. Once authenticated, storing secrets is incredibly straightforward thanks to the vault kv put command:
    $ vault kv put secret/foo bar=baz
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:10.604124Z
    deletion_time    n/a
    destroyed        false
    version          1
To break down the above command and response a little bit, we created a new secret called foo in the secret namespace with a value of bar=baz. In return, the response gives us some basic metadata about our new secret. While the created_time, deletion_time, and destroyed keys are pretty self-explanatory, you should take special note of the version key, because it implies that secrets can be versioned.
For example, let’s see what happens if we put a new value for the same secret:
    $ vault kv put secret/foo bat=ball
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:32.638788Z
    deletion_time    n/a
    destroyed        false
    version          2
See how the version metadata key was incremented? This means that our original value must be maintained in addition to the new values, which provides an excellent audit log of what secrets get changed and when.
Now, storing secrets is only half the battle. The other half is actually retrieving those secrets. Given our example above, let’s first take a look at how to retrieve our entire list of secrets:
    $ vault kv list secret
    Keys
    ----
    foo
As you can see, while we technically put two secrets, only one key is being tracked because those two secrets are really just two versions of a single secret. To retrieve it, execute the vault kv get command with the secret namespace and key:
    $ vault kv get secret/foo
    ====== Metadata ======
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:32.638788Z
    deletion_time    n/a
    destroyed        false
    version          2

    === Data ===
    Key    Value
    ---    -----
    bat    ball
By default, Vault will retrieve the most recent version of a secret, but if we want to retrieve a previous version, the -version flag can be used:
    $ vault kv get -version=1 secret/foo
    ====== Metadata ======
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:10.604124Z
    deletion_time    n/a
    destroyed        false
    version          1

    === Data ===
    Key    Value
    ---    -----
    bar    baz
Version-controlled secrets are incredibly valuable because they allow internal services to lock themselves to different secret versions, making it possible to gradually evolve, release, and roll back application changes without fear of losing important data.
Now, despite the benefits of version control, it can quickly become necessary to actually remove a secret (or a version of one). There are two methods to do this, depending on how “removed” you want the secret to be: delete and destroy. To illustrate, let’s first take a look at deleting a version of our foo secret:
    $ vault kv delete -versions=1 secret/foo
    Success! Data deleted (if it existed) at: secret/foo
This marks the data as deleted and prevents its retrieval in normal GET requests, but it doesn’t actually remove the data:
    $ vault kv get -version=1 secret/foo
    ====== Metadata ======
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:10.604124Z
    deletion_time    2019-08-09T16:45:39.664577Z
    destroyed        false
    version          1
In order for the data to actually be removed beyond recovery, the destroy command must be used:
    $ vault kv destroy -versions=1 secret/foo
    Success! Data written to: secret/destroy/foo
Instead of simply marking the data as deleted and limiting access to it, the destroy command will remove it entirely, making subsequent retrieval impossible:
    $ vault kv get -version=1 secret/foo
    ====== Metadata ======
    Key              Value
    ---              -----
    created_time     2019-08-09T16:43:10.604124Z
    deletion_time    2019-08-09T16:45:39.664577Z
    destroyed        true
    version          1
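To make the difference between put, delete, and destroy a little more concrete, here is a small in-memory Python model of a versioned store. It is only a sketch of the behavior shown in the transcripts above, not how Vault stores data internally; the VersionedStore class and its method names are made up for this example. The undelete method mirrors the vault kv undelete command, which reverses a soft delete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Version:
    data: Optional[Dict[str, str]]
    deleted: bool = False    # soft delete: data is kept but hidden
    destroyed: bool = False  # hard delete: data is gone for good


class VersionedStore:
    """Hypothetical toy model of a versioned key-value store."""

    def __init__(self):
        self._versions: Dict[str, List[Version]] = {}

    def put(self, key, **data):
        """Append a new version and return its 1-based version number."""
        versions = self._versions.setdefault(key, [])
        versions.append(Version(dict(data)))
        return len(versions)

    def get(self, key, version=None):
        """Return the requested (default: latest) version, or None if hidden."""
        versions = self._versions[key]
        index = version if version is not None else len(versions)
        entry = versions[index - 1]
        if entry.deleted or entry.destroyed:
            return None
        return entry.data

    def delete(self, key, version):
        """Hide a version without discarding its data."""
        self._versions[key][version - 1].deleted = True

    def undelete(self, key, version):
        """Restore a soft-deleted version; destroyed versions stay gone."""
        entry = self._versions[key][version - 1]
        if not entry.destroyed:
            entry.deleted = False

    def destroy(self, key, version):
        """Discard the data itself; only the metadata slot remains."""
        entry = self._versions[key][version - 1]
        entry.data = None
        entry.destroyed = True


if __name__ == "__main__":
    store = VersionedStore()
    store.put("foo", bar="baz")
    store.put("foo", bat="ball")
    store.delete("foo", 1)
    print(store.get("foo", version=1))  # hidden while soft-deleted
    store.undelete("foo", 1)
    print(store.get("foo", version=1))  # visible again
```

The point of the model is that delete only flips a flag, while destroy throws away the payload, so nothing can bring a destroyed version back.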
Vault is a complicated tool and managing secrets like this is only a tiny fraction of what can be done with it. While the finer points of Vault are well beyond the scope of this article, let’s touch upon just a few other concepts that make Vault so powerful.
    $ vault secrets enable database
    Success! Enabled the database secrets engine at: database/
Vault’s default key-value store is an example of a secrets engine (specifically, an engine called kv). At its core, a secrets engine is an abstracted storage mechanism for secrets data. This means that, instead of a key-value based storage mechanism, more targeted storage mechanisms can be used. For example, the database secrets engine can be used to dynamically generate database credentials based on configured roles for databases such as MySQL and MariaDB, allowing for automated root credential rotation or even temporary access credentials on-demand.
    $ vault auth enable github
    Success! Enabled github auth method at: github/
In addition to the standard token-based authentication method, Vault supports a number of additional authentication methods to better support your use cases. A great example of this is the GitHub authentication method, which can be used to automatically provide Vault access to developers that belong to a specific GitHub organization – and even a specific team within a GitHub organization – using only a personal access token. For larger organizations, enterprise-level single sign-on solutions like LDAP or Okta can be used to authenticate users to a Vault.
    $ vault write auth/userpass/users/test policies="dev-readonly,logs"
Authorization always goes hand-in-hand with authentication. While it’s easy to provide global access using GitHub or token-based authentication, it’s almost never a complete solution. Thanks to Vault’s policies, an RBAC-style authorization method can be implemented, giving different users and groups CRUD-like access to different facets of the vault itself. Combined with one of the more advanced authentication methods, this can become an incredibly powerful tool for fine-grained access controls within a large organization.
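As a rough mental model of how path-based policies translate into access decisions, consider the Python sketch below. It is a deliberate simplification and not Vault’s policy engine: the policy names dev-readonly and logs come from the command above, but their contents, the POLICIES table, and the is_allowed function are invented for illustration, and only exact paths and trailing-* globs are handled.

```python
# Hypothetical policy contents: each policy maps a path pattern to the
# capabilities it grants on that path.
POLICIES = {
    "dev-readonly": {"secret/data/dev/*": {"read", "list"}},
    "logs": {"secret/data/logs/*": {"read"}},
}


def is_allowed(policy_names, path, capability):
    """Return True if the combined policies grant `capability` on `path`."""
    # Merge capabilities granted on the same pattern by different policies.
    merged = {}
    for name in policy_names:
        for pattern, caps in POLICIES.get(name, {}).items():
            merged.setdefault(pattern, set()).update(caps)

    if path in merged:
        # An exact path match takes priority over any glob.
        caps = merged[path]
    else:
        # Otherwise pick the longest (most specific) matching prefix glob.
        globs = [p for p in merged if p.endswith("*") and path.startswith(p[:-1])]
        if not globs:
            return False  # nothing matches: access is denied by default
        caps = merged[max(globs, key=len)]

    # An explicit "deny" overrides everything else on that path.
    return "deny" not in caps and capability in caps


if __name__ == "__main__":
    print(is_allowed(["dev-readonly", "logs"], "secret/data/dev/db", "read"))
    print(is_allowed(["dev-readonly", "logs"], "secret/data/dev/db", "update"))
```

The deny-by-default behavior is the important part: a user with no matching policy gets nothing, and access is only ever added by attaching policies.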
As powerful as Vault is, getting it right can be difficult. While the size and scope of the various authentication methods and secrets engines make it clear just how much you can do with Vault, it can be hard to wrap your head around basic secrets management in the context of source code information security. Thanks to an impressively large number of both official and community API libraries, retrieving secrets in a secure manner is incredibly easy, and if you aspire to become a Vault power user, HashiCorp’s own Vault Curriculum is an excellent place to start.
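For a sense of what those libraries do under the hood, here is a minimal Python sketch that reads a secret from the key-value engine over Vault’s HTTP API using only the standard library. It assumes a version 2 key-value engine mounted at secret/ (as in the development server used in this article) and that the VAULT_ADDR and VAULT_TOKEN environment variables are set; the function names are made up for this example, and a real application would more likely use one of the official or community client libraries.

```python
import json
import os
import urllib.request


def build_kv_read_request(path, version=None, mount="secret"):
    """Build (but do not send) a read request for a KV version 2 secret."""
    addr = os.environ.get("VAULT_ADDR", "http://127.0.0.1:8200").rstrip("/")
    token = os.environ.get("VAULT_TOKEN", "")
    # KV v2 reads go through the mount's data/ prefix.
    url = f"{addr}/v1/{mount}/data/{path}"
    if version is not None:
        url += f"?version={int(version)}"
    return urllib.request.Request(url, headers={"X-Vault-Token": token})


def extract_secret(response_body):
    """Pull the key/value pairs out of a KV v2 read response."""
    payload = json.loads(response_body)
    # The secret sits under data.data; data.metadata holds the version info.
    return payload["data"]["data"]


def read_secret(path, version=None):
    """Fetch a secret from a running Vault server."""
    with urllib.request.urlopen(build_kv_read_request(path, version)) as resp:
        return extract_secret(resp.read())


if __name__ == "__main__":
    # Requires a running, unsealed Vault and a valid token in VAULT_TOKEN.
    print(read_secret("foo", version=1))
```

Reading the address and token from the environment keeps both out of the source code, which is the whole point of the exercise.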
Zachary Flower (@zachflower) is a Fixate IO Contributor, principal engineer at Automox—a Boulder-based patch management company—and freelance writer. With a passion for simplicity and usability within the development pipeline, Zach puts a strong emphasis on the importance of documentation, developer productivity, and shift-left testing strategies.