## The problem of having a self-hosted infrastructure
I've been maintaining a personal homelab and self-hosted infrastructure for a few years
now, and one of the most infuriating things when starting such a project is the dreaded
**Warning: Potential Security Risk Ahead** page that appears when you're using a
self-signed certificate, or when trying to use a password on a website or app that is
served over plain HTTP.
![A screenshot of a warning from Firefox indicating that the website that is being accessed is not secure.](/images/dns_article_firefox_warning.png)
While acceptable if you're alone on your own infrastructure or dev environment, this
poses several issues in many other contexts:
- It is not acceptable to publicly expose a website presenting this issue
- It's not advisable to say "hey look, I know that your browser gives you a big red
warning, but it's okay, you can just accept" to friends/family/etc. It's just a very
bad habit to have
- After a while, it really starts to get on your nerves
Thankfully, a free and by now well-known solution has existed for almost ten
years: [Let's Encrypt and the ACME protocol](https://letsencrypt.org/).
{{<callout type="note">}}
I promise this is not yet another Let's Encrypt tutorial... Well, it is, but for a more
specific use case.
{{</callout>}}
## The Let's Encrypt solution
### What is Let's Encrypt
[Let's Encrypt](https://letsencrypt.org/) is a nonprofit certificate authority founded
in November 2014. Its main goal was to provide an easy and free way to obtain a TLS
certificate in order to make it easy to use HTTPS everywhere.
The [ACME protocol](https://letsencrypt.org/docs/client-options/), developed by Let's
Encrypt, is an automated verification system that does the following:
- verifying that you own the domain for which you want a certificate
- creating and registering that certificate
- delivering the certificate to you
Most client implementations also have an automated renewal system, further reducing the
workload for sysadmins.
The current specification for the ACME protocol proposes two types of challenges
to prove ownership and control over a domain: [HTTP-01](https://letsencrypt.org/docs/challenge-types/#http-01-challenge) and [DNS-01](https://letsencrypt.org/docs/challenge-types/#dns-01-challenge).
{{<callout type="note">}}
Actually there are two (2) others: [TLS-SNI-01](https://letsencrypt.org/docs/challenge-types/#tls-sni-01) which is now disabled, and [TLS-ALPN-01](https://letsencrypt.org/docs/challenge-types/#tls-alpn-01) which is only aimed at a very
specific category of users, which we will ignore here.
{{</callout>}}
### The common solution: HTTP challenge
The [HTTP-01](https://letsencrypt.org/docs/challenge-types/#http-01-challenge) challenge
is the most common type of ACME challenge, and will satisfy most use-cases.
![A schema describing the HTTP challenge workflow for the ACME protocol and the interactions between the application server, Let's Encrypt, and the DNS server, all of them public.](/images/dns_article_http_challenge.svg)
For this challenge, we need the following elements:
- A domain name and a record for that domain in a public DNS server (it can be a self-hosted DNS server, our provider's, etc.)
- Access to a server with a public IP that can be publicly reached
When performing this type of challenge, the following happens (in a very simplified way):
1. The ACME client asks the Let's Encrypt API to start a challenge.
2. In return, it gets a token.
3. It then either starts a standalone server, or edits the configuration of our
current web server (nginx, apache, etc.) to serve a file containing the token and a fingerprint of our account key.
4. Let's Encrypt tries to resolve our domain `test.example.com`.
5. If resolution works, it requests the URL `http://test.example.com/.well-known/acme-challenge/<TOKEN>` and verifies that the file from step 3 is served with the correct
content.
If everything works as expected, the ACME client can download the certificate (the matching private key is generated locally by the client), and we can configure our reverse proxy or server to use this valid certificate.
All is well.
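
To make step 3 more concrete, here is a minimal sketch of what an ACME client does when it publishes the challenge file in an existing web root. The function name, the web root path and the token and thumbprint values are placeholders chosen for illustration: a real client receives the token from Let's Encrypt and computes the thumbprint from its account key.

```shell
# Sketch of step 3 of the HTTP-01 flow: publish the challenge file.
# $1: web root served by our web server (placeholder path)
# $2: token received from the ACME server
# $3: thumbprint (fingerprint) of our ACME account key
publish_http01_challenge() (
    webroot=$1 token=$2 thumbprint=$3
    dir="$webroot/.well-known/acme-challenge"
    mkdir -p "$dir"
    # The file body is "<token>.<thumbprint>" (the "key authorization")
    printf '%s.%s' "$token" "$thumbprint" > "$dir/$token"
)

# Hypothetical usage:
# publish_http01_challenge /var/www/html "$TOKEN" "$THUMBPRINT"
```

In practice, an ACME client such as certbot takes care of this step, so there is rarely a reason to write this file by hand.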
{{<callout type="help">}}
Okay, but my app contains my accounts, or my Proxmox management interface, and I
don't really want to make it public, so how does it work here?
{{</callout>}}
Well it doesn't. For this type of challenge to work, the application server **must** be
public. For this challenge we need to prove that we have control over the application
that uses the target domain (even if we don't control the domain itself). But the
DNS-01 challenge bypasses this limitation.
### When it's not enough: the DNS challenge
As we saw in the previous section, sometimes, for various reasons, the application
server is in a private zone. It must only be reachable from inside a private network,
but we might still want to be able to use a free Let's Encrypt certificate.
For this purpose, the [DNS-01](https://letsencrypt.org/docs/challenge-types/#dns-01-challenge) challenge is based on proving that one has control over the **DNS
server** itself, instead of the application server.
![A schema describing the DNS challenge workflow for the ACME protocol and the interaction between Let's Encrypt, the public DNS server and the private application server](/images/dns_article_dns_challenge_1.svg)
For this type of challenge, the following elements are needed:
- A public DNS server we have control over (can be a self-hosted server, or our DNS provider)
- An ACME client (usually it would be on the application server), which doesn't need to be public
Then, the challenge is done the following way:
1. The ACME client asks the Let's Encrypt API to start a challenge.
2. In return, it gets a token.
3. The client then creates a `TXT` record at `_acme-challenge.test.example.com`, whose value is derived from the token
and the account key.
4. Let's Encrypt tries to resolve the expected `TXT` record, and verifies that the content is correct.
If the verification succeeds, we can download our certificate, just like with the other
type of challenge.
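
For the curious, the value placed in the `TXT` record is defined by the ACME specification (RFC 8555) as the unpadded base64url encoding of the SHA-256 digest of the "key authorization", which is the token and the account key thumbprint joined by a dot. The sketch below computes it with standard tools only. The function name is made up for this example, and it assumes `sha256sum` and `base64` from coreutils are available.

```shell
# Sketch: compute the TXT record value for a DNS-01 challenge.
# $1 is the key authorization, i.e. "<token>.<account key thumbprint>".
dns01_txt_value() (
    # SHA-256 digest of the key authorization, as a hex string
    hex=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
    # Turn each hex pair back into a raw byte (via an octal escape),
    # then base64url-encode the 32 bytes and drop the padding
    for byte in $(printf '%s' "$hex" | sed 's/../& /g'); do
        printf "\\$(printf '%03o' "0x$byte")"
    done | base64 | tr '+/' '-_' | tr -d '=\n'
)

# Hypothetical usage:
# dns01_txt_value "$TOKEN.$THUMBPRINT"
```

The resulting 43-character string is what the client publishes at `_acme-challenge.test.example.com`. Again, ACME clients do this automatically.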
It's important to note that **at no point in time did Let's Encrypt have access to the
application server itself**, because this challenge involves proving that we control
the domain, not that we control the destination of that domain.
If I'm trying to obtain a valid certificate for my Proxmox interface, this is the way I
would want to go, because it would allow me to have a valid certificate, despite my server
not being public at all. So let's see how it works in practice.
## DNS challenge in practice
For this example, I will try to obtain a certificate for my own domain
`test.internal.example.com`. As this name suggests, it is an internal domain and should not
be publicly reachable, so this means I'm going to use a DNS challenge. I don't really want
to use my DNS provider's API for this, so I'm going to use a self-hosted [bind](https://www.isc.org/bind/)
server instead.
{{<callout type="note">}}
The rest of this "guide" will be based on a deployment for a `bind9` server. It can be
adapted to any other type of deployment, but all the configuration snippets are based
on `bind9`. Let's Encrypt has [relevant documentation](https://community.letsencrypt.org/t/dns-providers-who-easily-integrate-with-lets-encrypt-dns-validation/86438) for
other hosting providers.
{{</callout>}}
### Configuring the DNS server
The first step is configuring the DNS server. For this, I'll just use a [bind](https://bind9.readthedocs.io/en/v9.18.27/)
server installed from my usual package manager.
```bash
# example on Debian 12
sudo apt install bind9
```
The configuration lives in the `/etc/bind` directory, mostly in `/etc/bind/named.conf.local`.
And now, without any additional change needed, we have a second layer of authentication
for the DNS zone updates. We can go a little further and make sure that the private IPs
themselves are hidden from the outside.
## Bonus 2: completely hiding our private domains from outside
In this post, we implemented our own DNS server (or we used the one from our provider) in
order to resolve internal private hosts, and perform DNS challenges for those hosts in order
to obtain TLS certificates. But this is not entirely satisfying.
For example, we have the following record in our DNS zone:
```text
test A 192.168.1.2
```
This means that running `host test.internal.example.com` (or `dig`, or any other DNS query tool)
will return `192.168.1.2`, whether you're using your internal DNS, Google's, or any
other server. This is not great: this IP is private, and should not have any meaning
outside of your network. While there probably wouldn't be any impact, publicly
revealing that you have a private host named `test` on an internal domain, along with
its IP address (and thus part of your internal infrastructure), isn't ideal, especially
if you have 10 hosts instead of only one.
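
To check whether a name leaks a private address, a small helper along these lines can be handy. This is only a sketch: the function names are made up for this example, `8.8.8.8` (Google's public resolver) is just one possible outside resolver, and only the RFC 1918 IPv4 ranges are covered.

```shell
# Return success if $1 looks like an RFC 1918 private IPv4 address
# (10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16).
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a line for each private address that a public resolver
# returns for a name. Requires dig and network access.
# $1: name to check, $2: resolver to ask (defaults to 8.8.8.8)
report_leaked_addresses() {
    dig "@${2:-8.8.8.8}" "$1" A +short | while read -r ip; do
        if is_private_ipv4 "$ip"; then
            echo "$1 publicly resolves to private address $ip"
        fi
    done
}

# Hypothetical usage:
# report_leaked_addresses test.internal.example.com
```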
For this reason, we could use two DNS servers, each with a different purpose:
- A server inside the private network which would resolve the private hosts
- A server outside the private network, which is only used for the challenges
Indeed, inside our network, we don't really need to be publicly reachable, but we need
name resolution on our local hosts. In the same way, Let's Encrypt doesn't need any
`A` record to perform DNS challenges: it only needs a `TXT` record. Each server
can therefore have its own specific role.
![A schema describing the DNS challenge workflow for the ACME protocol with a separation between a public and a private DNS servers and the interaction between Let's Encrypt and the public DNS server on one side, and the private application server, the user, and the private DNS server on the other side](/images/dns_article_dns_challenge_2.svg)
Basically, what we need is the following:
- a publicly reachable DNS server (the one from the previous parts of this post), that will
  have:
  - only its own `NS` records
  - the TSIG key and rules to update the zone
  - optionally, the VPN tunnel
  - the `TXT` record to perform the DNS challenges
- a private DNS on your local infrastructure, that will have:
  - all the `A` (and other types of) DNS records for your internal infrastructure
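
As an illustration of how little the public server has to hold, the dynamic update sent for a challenge boils down to a few `nsupdate` commands. The sketch below only prints them. The function name, the server name `ns.example.com`, the 60-second TTL and the key file path in the usage line are assumptions made for this example, not values taken from this setup.

```shell
# Print the nsupdate commands that publish a DNS-01 challenge record.
# $1: DNS server to update
# $2: domain being validated
# $3: TXT value computed by the ACME client
acme_txt_update() {
    printf 'server %s\n' "$1"
    printf 'update add _acme-challenge.%s. 60 IN TXT "%s"\n' "$2" "$3"
    printf 'send\n'
}

# Hypothetical usage, authenticating with a TSIG key file:
# acme_txt_update ns.example.com test.internal.example.com "$value" \
#     | nsupdate -k /path/to/tsig.key
```

No `A` record is involved at any point, which is why the public server does not need to know anything about the internal hosts.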
Let's split the previous configuration (I'll use the one from the [Bonus 1](#bonus-1-adding-a-second-layer-of-authentication-to-connect-to-the-dns) section as an example).
### Private DNS server
On the private DNS server, the only thing we need is our local `internal.example.com` zone
definition, so our `named.conf.local` should look like this:
```text
zone "internal.example.com" IN {
    type master;
    file "/var/lib/bind/internal.example.com.zone";
    allow-update { none; };
};
```
And our zone definition would look like this:
```text
$ORIGIN .
$TTL 7200       ; 2 hours
internal.example.com    IN SOA  ns.internal.example.com. admin.example.com. (
                                2024070301 ; serial
                                3600       ; refresh (1 hour)
                                600        ; retry (10 minutes)
                                86400      ; expire (1 day)
                                600        ; minimum (10 minutes)
                                )
                        NS      ns.internal.example.com.
$ORIGIN internal.example.com.
ns                      A       192.168.1.1
test                    A       192.168.1.2
```
This server should be set as the DNS server in our DHCP configuration (or in the client
configuration if we don't use DHCP).
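
Before pointing clients at the private server, it can be worth validating the files with the checking tools shipped with BIND. Here is a small sketch, assuming the Debian layout (`/etc/bind/named.conf` as the main configuration file) and the zone file path used in this section. The function name is arbitrary.

```shell
# Validate the BIND configuration and the private zone file.
# $1: zone name, $2: zone file (defaults match this section's paths)
check_private_dns() (
    zone=${1:-internal.example.com}
    file=${2:-/var/lib/bind/internal.example.com.zone}
    # Only check the zone if the main configuration parses correctly
    named-checkconf /etc/bind/named.conf &&
        named-checkzone "$zone" "$file"
)

# Hypothetical usage:
# check_private_dns
```

Once the server is running, `dig @192.168.1.1 test.internal.example.com A +short` should answer `192.168.1.2` from inside the network.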
### Public DNS server
For the public DNS server, we don't need private `A` records, we just need the
configuration necessary to update the public zone, so our `named.conf.local`
file should look like this (it's the exact same configuration as before):