PKI Explained: How Certificates Actually Work in Production

Public Key Infrastructure explained for engineers: how certificate chains, trust stores, and TLS handshakes work, and the exact ways they fail in production.
⚙️ Intermediate: basic DSA knowledge assumed
In this tutorial, you'll learn:
  • A certificate is a signed claim, not a secret. Its value comes entirely from the trustworthiness of the CA that signed it, which is why a misconfigured trust store is a far bigger security risk than a weak password on the keystore file itself.
  • A missing Intermediate CA certificate in your server config is the single most common PKI incident in production. It passes all your own tests because your tooling does AIA fetching, then silently breaks every mobile client in production.
  • Reach for short-lived certificates (24-hour TTL) with automated rotation the moment you're managing more than a handful of internal service identities. OCSP and CRL are the right answer to the wrong question at that scale.
⚑ Quick Answer
Imagine every business in a city gets a laminated ID badge from the mayor's office. When you walk into a shop, you don't know the shopkeeper personally, but you trust the mayor, and the badge proves the mayor vouched for them. PKI is that exact system for the internet: a chain of vouching, anchored at a root everyone agrees to trust. The twist is that your browser doesn't trust 'the internet'; it trusts a specific, hardcoded list of mayors baked into your OS. If your mayor isn't on that list, the whole thing collapses, no matter how legitimate your badge is.

A fintech startup I consulted for lost six hours of payment processing because a certificate issued by their internal CA expired at 2:47 AM on a Tuesday. Their monitoring caught nothing: the service didn't crash, it just silently rejected every TLS handshake with 'PKIX path building failed: unable to find valid certification path to requested target'. Six engineers stared at perfectly healthy application logs while $340k in transactions queued. PKI didn't fail loudly. It failed quietly, at the edges, in a way nobody had written a runbook for.

PKI (Public Key Infrastructure) is the trust plumbing underneath every HTTPS connection, every signed JWT, every mTLS service mesh, and every code-signing pipeline you've ever touched. It answers one deceptively hard question: how do two strangers on a network prove to each other that they are who they claim to be, without having met before? Symmetric keys don't scale: you can't pre-share a secret with every website on the internet. PKI solves this with asymmetric cryptography layered over a hierarchy of trusted authorities. Get it right and it's invisible. Get it wrong and you're the 3 AM war room.

After this you'll be able to read a certificate chain and understand exactly what each field means and why it matters, trace a TLS handshake step by step and know where it can break, build and rotate certificates in a real service without downtime, debug the six most common certificate errors without guessing, and design an internal PKI for a microservices environment that won't bite you six months later.

Asymmetric Cryptography: The Math That Makes Trust Possible

Before PKI existed, encrypting traffic between two servers meant pre-sharing a secret key out-of-band: email it, phone it in, bake it into a config file checked into git (yes, this still happens). That doesn't scale, and it means anyone who intercepts the key exchange owns all past and future traffic. The entire premise of PKI is that you can publish a key openly, and doing so doesn't compromise you.

Asymmetric cryptography gives you a key pair: a public key you broadcast freely and a private key you guard with your life. Anything encrypted with the public key can only be decrypted by the matching private key. More importantly for PKI, anything signed with the private key can be verified by anyone holding the public key, without the verifier ever touching the private key. That second property is what makes certificates work.

A certificate is just a structured document that says: 'This public key belongs to example.com, and I, DigiCert, am signing this claim with my own private key.' Your browser holds DigiCert's public key (via the trust store), verifies DigiCert's signature, and concludes the public key genuinely belongs to example.com. The private key for example.com never travels over the wire. Not once. That's the whole trick.
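To make the "signed claim" idea concrete, here is a minimal sketch using only JDK primitives. `ToyCertificate` is a hypothetical stand-in, not X.509: it just binds a subject name to a public key and carries the issuer's signature over both. Real certificates add validity dates, extensions, and a standard encoding on top of the same idea.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;
import java.util.Base64;

class SignedClaimSketch {

    // Hypothetical toy certificate: subject name, subject public key, issuer signature.
    record ToyCertificate(String subject, PublicKey subjectKey, byte[] issuerSignature) {}

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
        generator.initialize(new ECGenParameterSpec("secp256r1"));
        return generator.generateKeyPair();
    }

    // The bytes the issuer signs: the claim "this key belongs to this subject".
    private static byte[] claimBytes(String subject, PublicKey subjectKey) {
        String encodedKey = Base64.getEncoder().encodeToString(subjectKey.getEncoded());
        return (subject + "|" + encodedKey).getBytes(StandardCharsets.UTF_8);
    }

    static ToyCertificate issue(String subject, PublicKey subjectKey, PrivateKey issuerKey)
            throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(issuerKey);
        signer.update(claimBytes(subject, subjectKey));
        return new ToyCertificate(subject, subjectKey, signer.sign());
    }

    static boolean verify(ToyCertificate cert, PublicKey issuerPublicKey)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(issuerPublicKey);
        verifier.update(claimBytes(cert.subject(), cert.subjectKey()));
        return verifier.verify(cert.issuerSignature());
    }

    public static void main(String[] args) throws Exception {
        KeyPair caKeys = newKeyPair();       // plays the role of the CA
        KeyPair siteKeys = newKeyPair();     // plays the role of example.com
        KeyPair attackerKeys = newKeyPair();

        ToyCertificate genuine = issue("example.com", siteKeys.getPublic(), caKeys.getPrivate());
        // The attacker swaps in their own key but can only reuse the old signature.
        ToyCertificate forged = new ToyCertificate(
                "example.com", attackerKeys.getPublic(), genuine.issuerSignature());

        System.out.println("Genuine verifies: " + verify(genuine, caKeys.getPublic()));
        System.out.println("Forged verifies : " + verify(forged, caKeys.getPublic()));
    }
}
```

Note that the site's private key never appears in the verification path: the verifier needs only the claim and the issuer's public key.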

RSA-2048 was the default for years. Today you should default to ECDSA P-256: smaller keys, faster handshakes, equivalent or better security. RSA-4096 is not meaningfully more secure than RSA-2048 against current threats, but it's noticeably slower. Don't reach for it unless a compliance checkbox forces your hand.
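The "smaller keys" claim is easy to check for yourself. The sketch below generates one key pair of each kind and prints the encoded public key size and the signature size; the class and method names are illustrative. It deliberately measures sizes only, since timing numbers depend on hardware and JVM warm-up.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;

class KeySizeComparison {

    static KeyPair ecP256() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
        generator.initialize(new ECGenParameterSpec("secp256r1"));
        return generator.generateKeyPair();
    }

    static KeyPair rsa2048() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Signs a fixed sample message and returns the signature length in bytes.
    static int signatureLength(KeyPair keys, String algorithm) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(algorithm);
        signer.initSign(keys.getPrivate());
        signer.update("sample message".getBytes(StandardCharsets.UTF_8));
        return signer.sign().length;
    }

    public static void main(String[] args) throws Exception {
        KeyPair ec = ecP256();
        KeyPair rsa = rsa2048();

        System.out.println("EC P-256 public key bytes : " + ec.getPublic().getEncoded().length);
        System.out.println("RSA-2048 public key bytes : " + rsa.getPublic().getEncoded().length);
        System.out.println("ECDSA signature bytes     : " + signatureLength(ec, "SHA256withECDSA"));
        System.out.println("RSA signature bytes       : " + signatureLength(rsa, "SHA256withRSA"));
    }
}
```

Every byte saved here is saved on each handshake, because the certificate and a signature travel over the wire every time a connection is established.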

AsymmetricSigningDemo.java · JAVA
package io.thecodeforge.dsa;

import java.security.*;
import java.security.spec.ECGenParameterSpec;
import java.util.Base64;

/**
 * Demonstrates the asymmetric signing primitive that underlies every
 * certificate verification in PKI. This is NOT a full PKI implementation;
 * it's the cryptographic foundation you need to understand before certificates
 * make sense.
 *
 * Production context: a payment gateway signing webhook payloads so the
 * receiving merchant can verify the payload wasn't tampered with in transit.
 */
public class AsymmetricSigningDemo {

    public static void main(String[] args) throws Exception {

        // --- KEY GENERATION ---
        // ECDSA with P-256 curve: the modern default. Prefer this over RSA-2048
        // for new systems. Smaller key, faster ops, same effective security.
        KeyPairGenerator keyPairGenerator = KeyPairGenerator.getInstance("EC");
        keyPairGenerator.initialize(new ECGenParameterSpec("secp256r1"), new SecureRandom());
        KeyPair gatewayKeyPair = keyPairGenerator.generateKeyPair();

        PublicKey  gatewayPublicKey  = gatewayKeyPair.getPublic();
        PrivateKey gatewayPrivateKey = gatewayKeyPair.getPrivate();

        // Simulate: gateway publishes its public key to merchants during onboarding.
        // This key is not secret: it's meant to be distributed.
        System.out.println("Gateway Public Key (share this openly):");
        System.out.println(Base64.getEncoder().encodeToString(gatewayPublicKey.getEncoded()));
        System.out.println();

        // --- SIGNING (happens inside the gateway before dispatching webhook) ---
        String webhookPayload = "{\"event\":\"payment.captured\",\"amount\":4999,\"currency\":\"GBP\"}";

        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(gatewayPrivateKey); // private key NEVER leaves this service
        signer.update(webhookPayload.getBytes());
        byte[] signature = signer.sign();

        String encodedSignature = Base64.getEncoder().encodeToString(signature);
        System.out.println("Webhook payload : " + webhookPayload);
        System.out.println("Signature (send in X-Gateway-Signature header): " + encodedSignature);
        System.out.println();

        // --- VERIFICATION (happens inside the merchant's webhook handler) ---
        // The merchant only needs the public key and never touches the private key.
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(gatewayPublicKey); // public key used for verification
        verifier.update(webhookPayload.getBytes());

        boolean isAuthentic = verifier.verify(Base64.getDecoder().decode(encodedSignature));
        System.out.println("Signature valid? " + isAuthentic); // true: payload is genuine

        // --- TAMPER DETECTION ---
        // Simulate a man-in-the-middle modifying the amount
        String tamperedPayload = "{\"event\":\"payment.captured\",\"amount\":1,\"currency\":\"GBP\"}";

        Signature tamperedVerifier = Signature.getInstance("SHA256withECDSA");
        tamperedVerifier.initVerify(gatewayPublicKey);
        tamperedVerifier.update(tamperedPayload.getBytes()); // different bytes -> different hash

        boolean tamperedResult = tamperedVerifier.verify(Base64.getDecoder().decode(encodedSignature));
        System.out.println("Tampered payload valid? " + tamperedResult); // false: tampering detected
    }
}
▶ Output
Gateway Public Key (share this openly):
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE[...base64 encoded public key...]

Webhook payload : {"event":"payment.captured","amount":4999,"currency":"GBP"}
Signature (send in X-Gateway-Signature header): MEYCIQDn[...base64 encoded signature...]

Signature valid? true
Tampered payload valid? false
⚠️
Never Do This: Signing the Hash YourselfDon't call MessageDigest.digest() then pass the result to Signature.update(). SHA256withECDSA already hashes internally β€” you'll be signing a hash of a hash, and verification will silently fail every time. The symptom is a valid-looking signature that always returns false on verify. Use the combined algorithm string (SHA256withECDSA, SHA256withRSA) and pass the raw payload bytes.
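The trap is easy to reproduce. This sketch signs the same payload twice, once correctly and once after pre-hashing it with MessageDigest, then verifies both signatures the way a standard verifier would, by feeding it the raw payload. The class name and payload are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;

class DoubleHashTrap {

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
        generator.initialize(new ECGenParameterSpec("secp256r1"));
        return generator.generateKeyPair();
    }

    // SHA256withECDSA hashes whatever bytes it is given before signing them.
    static byte[] sign(PrivateKey key, byte[] input) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(key);
        signer.update(input);
        return signer.sign();
    }

    static boolean verify(PublicKey key, byte[] input, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(key);
        verifier.update(input);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        KeyPair keys = newKeyPair();
        byte[] payload = "{\"event\":\"payment.captured\"}".getBytes(StandardCharsets.UTF_8);

        // Correct: hand the raw payload to the signer.
        byte[] goodSignature = sign(keys.getPrivate(), payload);

        // The trap: pre-hash, then sign. The signer hashes again internally.
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(payload);
        byte[] doubleHashedSignature = sign(keys.getPrivate(), digest);

        System.out.println("Raw payload signed   -> verifies: "
                + verify(keys.getPublic(), payload, goodSignature));
        System.out.println("Digest signed (trap) -> verifies: "
                + verify(keys.getPublic(), payload, doubleHashedSignature));
    }
}
```

The second signature is structurally well formed, which is what makes the bug hard to spot: nothing throws, the verifier simply returns false.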

Certificate Chains and Trust Stores: Why Your Browser Trusts Anything at All

Here's what most explanations skip: a certificate by itself proves nothing. I could generate a certificate right now that says 'This is google.com'. It takes about 30 seconds with OpenSSL. The certificate is cryptographically valid. But it's meaningless unless someone your browser already trusts has signed it.

This is the chain of trust. Every certificate is signed by a Certificate Authority (CA). That CA's certificate is signed by a Root CA. Root CA certificates are self-signed (they vouch for themselves), and they're valuable precisely because your OS vendor (Microsoft, Apple, Linux distro maintainers) manually vetted them and baked them into the trust store. On a JVM, that's the cacerts file inside your JRE. On Linux it's /etc/ssl/certs. On macOS it's the Keychain. These are the hardcoded mayors.
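If you want to see the JVM's own list of mayors, the sketch below asks the default TrustManagerFactory for its accepted issuers. Passing a null KeyStore to init() tells the JDK to load its default trust store (normally cacerts, unless the javax.net.ssl.trustStore system property points elsewhere). The class name is illustrative.

```java
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

class TrustStoreLister {

    // Returns every CA certificate the JVM trusts by default.
    static List<X509Certificate> defaultTrustAnchors() throws Exception {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        factory.init((KeyStore) null); // null = use the JVM's default trust store

        List<X509Certificate> anchors = new ArrayList<>();
        for (TrustManager manager : factory.getTrustManagers()) {
            if (manager instanceof X509TrustManager x509Manager) {
                anchors.addAll(List.of(x509Manager.getAcceptedIssuers()));
            }
        }
        return anchors;
    }

    // Root certificates name themselves as their own issuer.
    static boolean isSelfIssued(X509Certificate cert) {
        return cert.getSubjectX500Principal().equals(cert.getIssuerX500Principal());
    }

    public static void main(String[] args) throws Exception {
        List<X509Certificate> anchors = defaultTrustAnchors();
        System.out.println("Trusted CA certificates: " + anchors.size());

        // Print a handful so the output stays readable.
        anchors.stream().limit(5).forEach(cert ->
                System.out.println("  " + cert.getSubjectX500Principal().getName()
                        + " | self-issued=" + isSelfIssued(cert)
                        + " | expires=" + cert.getNotAfter()));
    }
}
```

Running this inside a container image is a quick way to confirm which trust store your service actually sees, which is not always the one you think you shipped.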

In practice, Root CAs don't sign end-entity certificates directly: the Root CA private key is kept offline in a literal hardware vault. Instead they sign Intermediate CA certificates, which do the day-to-day signing. This is deliberate: if an Intermediate gets compromised, you revoke that Intermediate without touching the Root. The chain looks like: Root CA → Intermediate CA → Your Certificate.

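When your service presents a certificate during a TLS handshake, it must send the full chain: its own certificate plus every Intermediate. The Root is omitted because the client already has it in the trust store. Miss an Intermediate and strict clients fail with an unknown-issuer error: 'unable to get local issuer certificate' from OpenSSL, 'PKIX path building failed' from Java, SEC_ERROR_UNKNOWN_ISSUER from Firefox. That error burns junior engineers for half a day because the leaf certificate itself is fine; the client simply cannot build a path from it to a root it trusts. (SSL_ERROR_RX_RECORD_TOO_LONG, which often gets blamed here, usually means the port is answering in plain HTTP rather than TLS.)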
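The path-building logic can be sketched with toy certificates. `ToyCert` below is a hypothetical record, not X.509, and the trust store is just a map from CA name to public key. The validator walks from the leaf toward a trusted root using only what the server presented, which is exactly why a missing Intermediate is fatal. A real validator also checks validity dates, basic constraints, key usage, and more.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;
import java.util.Base64;
import java.util.List;
import java.util.Map;

class ToyChainValidator {

    // Hypothetical stand-in for a certificate: names, a key, and the issuer's signature.
    record ToyCert(String subject, String issuer, PublicKey key, byte[] signature) {}

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
        generator.initialize(new ECGenParameterSpec("secp256r1"));
        return generator.generateKeyPair();
    }

    private static byte[] signedBytes(String subject, String issuer, PublicKey key) {
        String encodedKey = Base64.getEncoder().encodeToString(key.getEncoded());
        return (subject + "|" + issuer + "|" + encodedKey).getBytes(StandardCharsets.UTF_8);
    }

    static ToyCert issue(String subject, PublicKey subjectKey, String issuer, PrivateKey issuerKey)
            throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(issuerKey);
        signer.update(signedBytes(subject, issuer, subjectKey));
        return new ToyCert(subject, issuer, subjectKey, signer.sign());
    }

    static boolean signedBy(ToyCert cert, PublicKey issuerKey) throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(issuerKey);
        verifier.update(signedBytes(cert.subject(), cert.issuer(), cert.key()));
        return verifier.verify(cert.signature());
    }

    // presented.get(0) is the leaf; trustStore maps trusted root names to their keys.
    static boolean validate(List<ToyCert> presented, Map<String, PublicKey> trustStore)
            throws GeneralSecurityException {
        if (presented.isEmpty()) return false;
        ToyCert current = presented.get(0);
        for (int hop = 0; hop < presented.size(); hop++) {
            PublicKey trustedKey = trustStore.get(current.issuer());
            if (trustedKey != null) {
                return signedBy(current, trustedKey); // reached a trust anchor
            }
            ToyCert issuerCert = null;
            for (ToyCert candidate : presented) {
                if (candidate != current && candidate.subject().equals(current.issuer())) {
                    issuerCert = candidate;
                }
            }
            // No issuer in the presented chain: this is the missing-Intermediate case.
            if (issuerCert == null || !signedBy(current, issuerCert.key())) return false;
            current = issuerCert;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        KeyPair root = newKeyPair();
        KeyPair intermediate = newKeyPair();
        KeyPair leaf = newKeyPair();

        ToyCert intermediateCert =
                issue("Toy Intermediate CA", intermediate.getPublic(), "Toy Root CA", root.getPrivate());
        ToyCert leafCert =
                issue("api.example.com", leaf.getPublic(), "Toy Intermediate CA", intermediate.getPrivate());
        Map<String, PublicKey> trustStore = Map.of("Toy Root CA", root.getPublic());

        System.out.println("Full chain valid     : "
                + validate(List.of(leafCert, intermediateCert), trustStore));
        System.out.println("Leaf only (no inter.): " + validate(List.of(leafCert), trustStore));
    }
}
```

The leaf is identical in both calls. Only the presented chain changes, which is why this class of failure never shows up when you inspect the certificate file on its own.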

CertificateChainInspector.java · JAVA
package io.thecodeforge.dsa;

import javax.net.ssl.*;
import java.security.cert.*;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/**
 * Production utility: inspect the certificate chain returned by a live TLS
 * endpoint. Run this against your own services during CI to catch chain
 * issues before they reach production.
 *
 * Real-world use: added to a fintech deployment pipeline after an Intermediate
 * CA cert was accidentally omitted from the Nginx config, causing mobile clients
 * (which don't do AIA fetching) to fail while desktop browsers succeeded.
 */
public class CertificateChainInspector {

    public static void main(String[] args) throws Exception {
        String targetHost = "api.example.com"; // replace with your service host
        int    targetPort = 443;

        // Build an SSL context backed by the default JVM trust store (cacerts).
        // This reflects exactly what your service-to-service calls will see.
        SSLContext sslContext = SSLContext.getDefault();
        SSLSocketFactory socketFactory = sslContext.getSocketFactory();

        // Open the TLS connection and capture the certificate chain presented
        // by the server: this is what the handshake actually receives.
        try (SSLSocket sslSocket = (SSLSocket) socketFactory.createSocket(targetHost, targetPort)) {
            sslSocket.startHandshake(); // triggers the full TLS handshake

            SSLSession session = sslSocket.getSession();
            Certificate[] peerCertificates = session.getPeerCertificates();

            System.out.printf("Chain length: %d (should be 2 or 3; if 1, Intermediate is missing)%n",
                    peerCertificates.length);
            System.out.println();

            // Index 0 is always the leaf (end-entity) certificate.
            // Index 1 is the Intermediate CA.
            // Index 2 (if present) is a second Intermediate or the Root.
            for (int i = 0; i < peerCertificates.length; i++) {
                X509Certificate cert = (X509Certificate) peerCertificates[i];

                System.out.printf("=== Certificate [%d] ===%n", i);
                System.out.println("Subject : " + cert.getSubjectX500Principal().getName());
                System.out.println("Issuer  : " + cert.getIssuerX500Principal().getName());
                System.out.println("Valid from : " + cert.getNotBefore());
                System.out.println("Expires    : " + cert.getNotAfter()); // track this in alerting

                // Key usage tells you what this certificate is authorised to do.
                // End-entity certs should NOT have keyCertSign set; only CAs should.
                boolean[] keyUsage = cert.getKeyUsage();
                if (keyUsage != null) {
                    System.out.println("Key Usage  : " + Arrays.toString(keyUsage));
                    // keyUsage[5] == true means keyCertSign, a red flag on a leaf cert
                    if (keyUsage[5]) {
                        System.out.println("  ⚠ WARNING: keyCertSign is set (this cert can sign other certs)");
                    }
                }

                // SAN (Subject Alternative Names) is what modern TLS checks for hostname matching.
                // CN matching was deprecated in RFC 2818, and Chrome 58+ ignores the CN entirely.
                // Java reports 'No subject alternative names present' when a connection made
                // by IP address hits a certificate that carries no SANs.
                try {
                    Collection<List<?>> sans = cert.getSubjectAlternativeNames();
                    if (sans != null) {
                        System.out.println("SANs :");
                        for (List<?> san : sans) {
                            // Type 2 = dNSName, Type 7 = iPAddress
                            System.out.println("  type=" + san.get(0) + " value=" + san.get(1));
                        }
                    } else {
                        System.out.println("  ⚠ WARNING: No SANs (hostname validation will fail in modern clients)");
                    }
                } catch (CertificateParsingException e) {
                    System.out.println("  Could not parse SANs: " + e.getMessage());
                }
                System.out.println();
            }
        }
    }
}
▶ Output
Chain length: 3 (should be 2 or 3; if 1, Intermediate is missing)

=== Certificate [0] ===
Subject : CN=api.example.com
Issuer : CN=DigiCert TLS RSA SHA256 2020 CA1, O=DigiCert Inc, C=US
Valid from : Mon Jan 15 00:00:00 UTC 2024
Expires : Wed Feb 12 23:59:59 UTC 2025
Key Usage : [true, false, false, false, false, false, false, false, false]
SANs :
type=2 value=api.example.com
type=2 value=www.api.example.com

=== Certificate [1] ===
Subject : CN=DigiCert TLS RSA SHA256 2020 CA1, O=DigiCert Inc, C=US
Issuer : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Valid from : Wed Sep 23 00:00:00 UTC 2020
Expires : Mon Sep 22 23:59:59 UTC 2030
Key Usage : [true, false, true, false, false, true, false, false, false]
⚠ WARNING: keyCertSign is set β€” this cert can sign other certs

=== Certificate [2] ===
Subject : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Issuer : CN=DigiCert Global Root CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Valid from : Fri Nov 10 00:00:00 UTC 2006
Expires : Mon Nov 10 00:00:00 UTC 2031
⚠️
Production Trap: Mobile Clients Don't Do AIA FetchingDesktop browsers silently recover from a missing Intermediate by fetching it via the Authority Information Access (AIA) extension in your leaf cert. Android and iOS apps do not β€” they fail immediately with a hard handshake error. This creates a nightmare where your service works fine in Postman and Chrome, but your mobile app crashes for 100% of users. Always verify your chain length is β‰₯ 2 using the CertificateChainInspector pattern above before any production deployment.

mTLS and Internal PKI: Certificate Management in a Real Microservices Mesh

Server-side TLS proves the server is who it claims to be. Mutual TLS (mTLS) goes both ways β€” the client also presents a certificate, and the server validates it. This is the right authentication model for service-to-service communication inside a microservices architecture. No shared secrets, no API keys rotated manually, no JWT signing keys sitting in environment variables.
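On the server side, the "goes both ways" part comes down to one setting: the server must demand a client certificate instead of merely accepting one. A minimal sketch of that configuration with the JDK's SSLParameters follows; the class and method names are illustrative, and the protocol list is an assumption you should adapt to your peers.

```java
import java.util.Arrays;
import javax.net.ssl.SSLParameters;

class MtlsServerParameters {

    // Builds TLS parameters for a server that rejects clients without a certificate.
    static SSLParameters requireClientCertificates() {
        SSLParameters parameters = new SSLParameters();
        // Assumption: TLS 1.3 with a 1.2 fallback for older internal peers.
        parameters.setProtocols(new String[] {"TLSv1.3", "TLSv1.2"});
        // "need" fails the handshake when no client cert is presented;
        // "want" would only ask for one and carry on without it.
        parameters.setNeedClientAuth(true);
        return parameters;
    }

    public static void main(String[] args) {
        SSLParameters parameters = requireClientCertificates();
        System.out.println("Need client auth: " + parameters.getNeedClientAuth());
        System.out.println("Protocols       : " + Arrays.toString(parameters.getProtocols()));
    }
}
```

These parameters would be applied to an SSLServerSocket or SSLEngine through setSSLParameters(). The distinction between "need" and "want" matters: a server that only wants a certificate will happily serve anonymous clients, which defeats the point of mTLS.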

The problem is that mTLS at scale requires issuing, rotating, and revoking potentially thousands of short-lived certificates, one per service identity. Managing this by hand is how you end up with a spreadsheet of certificate expiry dates that someone stops updating six months in. The answer is an internal CA, and the operational answer to the scale problem is automating issuance using something like HashiCorp Vault PKI Secrets Engine or cert-manager on Kubernetes.

Short-lived certificates are the modern answer to revocation. Traditional CRL (Certificate Revocation List) and OCSP (Online Certificate Status Protocol) are both operationally painful: CRLs are downloaded periodically and can be stale, while OCSP requires a real-time round-trip that adds latency and creates a liveness dependency. If you issue certificates with a 24-hour TTL and rotate them automatically, a compromised certificate expires before you've finished your incident response. Revocation becomes a non-issue.
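The arithmetic behind that claim is worth spelling out. The sketch below computes two things for a certificate: when to renew it, and how long a stolen key stays usable if nobody revokes it. The renew-at-two-thirds-of-lifetime policy is an assumption chosen for illustration, not a rule from any particular tool, and the dates are made up.

```java
import java.time.Duration;
import java.time.Instant;

class ShortLivedCertMath {

    // Assumed policy: renew once two thirds of the certificate's lifetime has elapsed.
    static Instant renewAt(Instant notBefore, Instant notAfter) {
        Duration lifetime = Duration.between(notBefore, notAfter);
        return notBefore.plus(lifetime.multipliedBy(2).dividedBy(3));
    }

    // Without revocation, a stolen key remains usable until the certificate expires.
    static Duration worstCaseExposure(Instant notAfter, Instant compromisedAt) {
        Duration remaining = Duration.between(compromisedAt, notAfter);
        return remaining.isNegative() ? Duration.ZERO : remaining;
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2025-01-01T00:00:00Z");
        Instant compromisedAt = Instant.parse("2025-01-01T10:00:00Z");

        Instant shortLivedExpiry = issued.plus(Duration.ofHours(24));
        Instant yearLongExpiry = issued.plus(Duration.ofDays(365));

        System.out.println("24h cert, renew at     : " + renewAt(issued, shortLivedExpiry));
        System.out.println("24h cert, exposure     : "
                + worstCaseExposure(shortLivedExpiry, compromisedAt).toHours() + " hours");
        System.out.println("1-year cert, exposure  : "
                + worstCaseExposure(yearLongExpiry, compromisedAt).toDays() + " days");
    }
}
```

With a 24-hour certificate the exposure window is bounded by hours whether or not anyone notices the compromise. With a year-long certificate the same window is only closed by revocation actually reaching every client.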

I've seen teams fight for months trying to get OCSP stapling right on Nginx inside a Kubernetes cluster, only to eventually migrate to 24-hour cert rotation via Vault and throw away the entire OCSP infrastructure. Don't start with OCSP for internal PKI. Start with short TTLs and automated rotation.

MutualTlsServiceClient.java · JAVA
package io.thecodeforge.dsa;

import javax.net.ssl.*;
import java.io.FileInputStream;
import java.net.URI;
import java.net.http.*;
import java.security.KeyStore;
import java.time.Duration;

/**
 * Production pattern: an order-processing service calling an inventory service
 * using mTLS. Both services have certificates issued by the same internal CA.
 * The calling service presents its own certificate, and the server validates it
 * against the internal CA trust store, not a public CA.
 *
 * Prerequisite: generate keystores with:
 *   keytool -genkeypair -alias order-service -keyalg EC -groupname secp256r1
 *     -keystore order-service-keystore.p12 -storetype PKCS12 -validity 1
 *   (validity 1 day: short-lived certificates, rotated via Vault or cert-manager)
 *
 * The internal CA cert goes into the truststore:
 *   keytool -import -alias internal-ca -file internal-ca.crt
 *     -keystore internal-ca-truststore.p12 -storetype PKCS12
 */
public class MutualTlsServiceClient {

    // Paths to keystores. In production these come from Vault agent injection
    // or a Kubernetes secret mounted as a volume, NOT from environment variables.
    private static final String KEYSTORE_PATH   = "/etc/certs/order-service-keystore.p12";
    private static final String TRUSTSTORE_PATH = "/etc/certs/internal-ca-truststore.p12";

    // In production: read from Vault or a secrets manager, never hardcoded.
    private static final char[] KEYSTORE_PASSWORD   = "changeit".toCharArray();
    private static final char[] TRUSTSTORE_PASSWORD = "changeit".toCharArray();

    public static void main(String[] args) throws Exception {
        HttpClient mtlsClient = buildMtlsHttpClient();

        HttpRequest inventoryRequest = HttpRequest.newBuilder()
                .uri(URI.create("https://inventory-service.internal:8443/api/v1/stock/SKU-99821"))
                .timeout(Duration.ofSeconds(5))
                .header("Accept", "application/json")
                .GET()
                .build();

        HttpResponse<String> response = mtlsClient.send(
                inventoryRequest,
                HttpResponse.BodyHandlers.ofString()
        );

        System.out.println("Status : " + response.statusCode());
        System.out.println("Body   : " + response.body());
    }

    private static HttpClient buildMtlsHttpClient() throws Exception {

        // --- KEYSTORE: our identity ---
        // Contains the order-service's private key and its certificate.
        // Presented to the inventory service during the TLS handshake.
        KeyStore identityKeystore = KeyStore.getInstance("PKCS12");
        try (FileInputStream keystoreStream = new FileInputStream(KEYSTORE_PATH)) {
            identityKeystore.load(keystoreStream, KEYSTORE_PASSWORD);
        }

        KeyManagerFactory keyManagerFactory = KeyManagerFactory.getInstance(
                KeyManagerFactory.getDefaultAlgorithm() // SunX509
        );
        keyManagerFactory.init(identityKeystore, KEYSTORE_PASSWORD);

        // --- TRUSTSTORE: who we trust ---
        // Contains only the internal CA certificate, NOT the public CA bundle.
        // This prevents any publicly-trusted cert from impersonating an internal service.
        KeyStore internalTruststore = KeyStore.getInstance("PKCS12");
        try (FileInputStream truststoreStream = new FileInputStream(TRUSTSTORE_PATH)) {
            internalTruststore.load(truststoreStream, TRUSTSTORE_PASSWORD);
        }

        TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(
                TrustManagerFactory.getDefaultAlgorithm() // PKIX
        );
        trustManagerFactory.init(internalTruststore); // scoped to internal CA only

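        // --- SSL CONTEXT ---
        // Drop TLS 1.0 and 1.1: they're deprecated and broken.
        // TLS 1.2 is acceptable if you have legacy services that don't support 1.3 yet.
        // Note: the name passed to getInstance() does not by itself disable older
        // protocol versions, so the allowed versions are pinned via SSLParameters below.
        SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
        sslContext.init(
                keyManagerFactory.getKeyManagers(),   // our certificate presented to server
                trustManagerFactory.getTrustManagers(), // CAs we accept server certs from
                null // SecureRandom: null means JVM default (acceptable for most cases)
        );

        // TLSv1.3 only. Add "TLSv1.2" here if a legacy peer needs it.
        SSLParameters tlsParameters = new SSLParameters();
        tlsParameters.setProtocols(new String[] {"TLSv1.3"});

        return HttpClient.newBuilder()
                .sslContext(sslContext)
                .sslParameters(tlsParameters)
                .connectTimeout(Duration.ofSeconds(3))
                .version(HttpClient.Version.HTTP_2) // prefer HTTP/2; falls back to HTTP/1.1 if the server doesn't offer it
                .build();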
    }
}
▶ Output
Status : 200
Body : {"sku":"SKU-99821","available":142,"reserved":18,"warehouse":"LHR-1"}
⚠️
Senior Shortcut: Scope Your Truststore to Internal CA OnlyWhen building the SSLContext for internal mTLS, initialize your TrustManagerFactory with your internal CA truststore β€” not the default system truststore. If you use the system default, any certificate signed by any public CA (DigiCert, Let's Encrypt, etc.) can successfully authenticate to your internal service. That's a serious misconfiguration. Pin the trust explicitly to your internal CA, and an attacker with a legitimate public cert can't impersonate your inventory service.

Certificate Rotation Without Downtime: The Operational Part Nobody Teaches

Getting PKI conceptually right is the easy part. Operating it without 3 AM incidents is where engineers earn their keep. Certificates expire. That's not a bug, it's the security model working as intended. Your job is to make rotation invisible to traffic.

The biggest operational mistake I see is treating certificate rotation as a one-time manual task. It is not. In a production system with dozens of services, certificate rotation must be automated and observable. Every certificate in your infrastructure needs an expiry date in your monitoring system, with alerts at 30 days, 14 days, and 7 days. The cert that killed that fintech's payments? Nobody had an alert. They found out from Stripe's webhook delivery failure logs.

The secret to zero-downtime rotation is an overlap window. Issue and deploy the replacement while the old certificate is still comfortably valid, and have the load balancer or web server reload it without dropping connections. A certificate is only consulted during the handshake, so established connections carry on undisturbed while new handshakes pick up the new certificate, and traffic cuts over gradually as clients reconnect. Note that SNI selects a certificate by hostname; it does not choose between the old and new certificate for the same name. Once the old cert is within 24 hours of expiry with near-zero active sessions still using it, you pull it.

For services using mTLS client certificates, rotation is trickier because both sides must keep trusting each other throughout the window. If the server's truststore holds only the issuing CA, a reissued client certificate from the same CA validates with no truststore change at all. The delicate case is rotating the CA itself, or any setup that pins individual client certificates: the truststore then needs to contain both the old and the new entry until every client has switched. Automate this with cert-manager's automatic renewal or Vault's PKI secrets engine, and build a /health/cert endpoint into every service that returns expiry dates. Make it part of your readiness probe.
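A /health/cert endpoint does not need a framework. The sketch below uses the JDK's built-in com.sun.net.httpserver.HttpServer to serve expiry data as JSON. Everything here is an assumption for illustration: the JSON field names, the in-memory map standing in for a real keystore scan, and plain HTTP instead of TLS to keep the example short.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class CertHealthEndpointSketch {

    // Renders alias -> expiry as JSON. Assumes aliases contain no characters needing escaping.
    static String renderJson(Map<String, Instant> expiryByAlias, Instant now) {
        StringJoiner entries = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, Instant> entry : expiryByAlias.entrySet()) {
            long hoursRemaining = Duration.between(now, entry.getValue()).toHours();
            entries.add(String.format(
                    "{\"alias\":\"%s\",\"expiresAt\":\"%s\",\"hoursRemaining\":%d}",
                    entry.getKey(), entry.getValue(), hoursRemaining));
        }
        return "{\"certificates\":" + entries + "}";
    }

    static HttpServer start(int port, Map<String, Instant> expiryByAlias) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health/cert", exchange -> {
            byte[] body = renderJson(expiryByAlias, Instant.now()).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in data; a real service would read notAfter from its keystore.
        Map<String, Instant> expiry = new LinkedHashMap<>();
        expiry.put("order-service", Instant.now().plus(Duration.ofHours(20)));

        HttpServer server = start(0, expiry); // port 0 = let the OS pick a free port
        try {
            int port = server.getAddress().getPort();
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder(URI.create("http://localhost:" + port + "/health/cert")).build(),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } finally {
            server.stop(0);
        }
    }
}
```

Reporting hours rather than days is deliberate: with 24-hour certificates, a day-granularity metric reads as zero for the certificate's entire life.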

CertificateExpiryHealthIndicator.java · JAVA
package io.thecodeforge.dsa;

import java.io.FileInputStream;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import java.util.*;

/**
 * Production health check: expose certificate expiry for every cert in the
 * service's keystore as a structured health endpoint.
 *
 * Wire this into Spring Boot Actuator, Dropwizard HealthCheck, or your custom
 * /health endpoint. Feed the output into your APM (Datadog, New Relic, etc.)
 * as a custom metric: cert.days_until_expiry, tagged by alias.
 *
 * Rule: if any cert expires within CRITICAL_THRESHOLD_DAYS, the health check
 * reports CRITICAL, throws, and PagerDuty fires. This is non-negotiable.
 */
public class CertificateExpiryHealthIndicator {

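    // Alert at 30 days warning, page at 7 days critical.
    // These numbers came from a post-incident review: 14 days wasn't enough
    // buffer for the procurement cycle at one enterprise client.
    // They suit long-lived certificates; for 24-hour certs, express the
    // thresholds in hours or as a fraction of the lifetime instead.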
    private static final long WARNING_THRESHOLD_DAYS  = 30;
    private static final long CRITICAL_THRESHOLD_DAYS = 7;

    public static void main(String[] args) throws Exception {
        String keystorePath     = "/etc/certs/order-service-keystore.p12";
        char[] keystorePassword = "changeit".toCharArray();

        List<CertificateExpiryReport> report = inspectKeystore(keystorePath, keystorePassword);

        HealthStatus overallStatus = HealthStatus.HEALTHY;

        for (CertificateExpiryReport entry : report) {
            System.out.printf("Alias: %-30s | Expires: %s | Days remaining: %d | Status: %s%n",
                    entry.alias(), entry.expiresAt(), entry.daysRemaining(), entry.status());

            // Escalate overall status to the worst individual status.
            // HealthStatus is declared in increasing order of severity, so the
            // ordinal comparison also covers EXPIRED.
            if (entry.status().ordinal() > overallStatus.ordinal()) {
                overallStatus = entry.status();
            }
        }

        System.out.println();
        System.out.println("Overall certificate health: " + overallStatus);

        // In production: if CRITICAL or EXPIRED, throw an exception to fail the readiness probe.
        // This prevents Kubernetes from sending traffic to a service with an expired cert
        // that will cause every downstream mTLS handshake to fail.
        if (overallStatus == HealthStatus.CRITICAL || overallStatus == HealthStatus.EXPIRED) {
            throw new CertificateCriticalException(
                    "One or more certificates expire within " + CRITICAL_THRESHOLD_DAYS + " days. Rotation required immediately."
            );
        }
    }

    private static List<CertificateExpiryReport> inspectKeystore(
            String keystorePath, char[] password) throws Exception {

        KeyStore keystore = KeyStore.getInstance("PKCS12");
        try (FileInputStream inputStream = new FileInputStream(keystorePath)) {
            keystore.load(inputStream, password);
        }

        List<CertificateExpiryReport> reports = new ArrayList<>();
        Enumeration<String> aliases = keystore.aliases();

        while (aliases.hasMoreElements()) {
            String alias = aliases.nextElement();

            // Only inspect certificate entries; skip private key entries without certs
            if (!keystore.isCertificateEntry(alias) && !keystore.isKeyEntry(alias)) continue;

            X509Certificate cert = (X509Certificate) keystore.getCertificate(alias);
            if (cert == null) continue;

            Instant now = Instant.now();
            Instant expiresAt = cert.getNotAfter().toInstant();
            // toDays() truncates, so a cert with 12 hours left reports 0 days remaining.
            long daysRemaining = Duration.between(now, expiresAt).toDays();

            HealthStatus status;
            if (!expiresAt.isAfter(now)) {
                status = HealthStatus.EXPIRED; // already dead
            } else if (daysRemaining <= CRITICAL_THRESHOLD_DAYS) {
                status = HealthStatus.CRITICAL;
            } else if (daysRemaining <= WARNING_THRESHOLD_DAYS) {
                status = HealthStatus.WARNING;
            } else {
                status = HealthStatus.HEALTHY;
            }

            reports.add(new CertificateExpiryReport(alias, expiresAt, daysRemaining, status));
        }

        return reports;
    }

    // Structured report per certificate entry in the keystore
    record CertificateExpiryReport(
            String alias,
            Instant expiresAt,
            long daysRemaining,
            HealthStatus status
    ) {}

    enum HealthStatus { HEALTHY, WARNING, CRITICAL, EXPIRED }

    static class CertificateCriticalException extends RuntimeException {
        CertificateCriticalException(String message) { super(message); }
    }
}
▶ Output
Alias: order-service | Expires: 2025-02-12T23:59:59Z | Days remaining: 6 | Status: CRITICAL
Alias: internal-ca | Expires: 2030-09-22T23:59:59Z | Days remaining: 2057 | Status: HEALTHY

Overall certificate health: CRITICAL
Exception in thread "main" io.thecodeforge.dsa.CertificateExpiryHealthIndicator$CertificateCriticalException: One or more certificates expire within 7 days. Rotation required immediately.
🔥 Interview Gold: Why Short-Lived Certs Beat Revocation Lists
CRL and OCSP both solve the same problem: what happens when a private key is compromised before the cert expires? The answer is to revoke it. But CRL distribution points can lag by hours, and OCSP adds a real-time network dependency to every TLS handshake. Short-lived certificates (24-hour TTL, auto-rotated) sidestep the problem entirely: a compromised key expires before most incident responses are complete. This is why Kubernetes service mesh projects like Istio default to 24-hour certificate lifetimes. It's not that revocation is hard to implement; it's the wrong solution to the problem.
Aspect | Public CA (DigiCert, Let's Encrypt) | Internal CA (Vault, cert-manager)
Trust scope | Trusted by all browsers and OS trust stores globally | Trusted only by systems you configure (internal services only)
Certificate cost | Let's Encrypt: free; DigiCert OV/EV: $100-$1000+/year | Infrastructure cost only; issuance itself is free at scale
Issuance speed | Let's Encrypt: seconds (ACME); DV/OV: minutes to days | Milliseconds via Vault API or cert-manager CertificateRequest
Revocation mechanism | CRL + OCSP: publicly accessible, some lag | Short TTL + immediate Vault lease revocation: no lag
Wildcard certificates | Supported (DNS-01 ACME challenge required for Let's Encrypt) | Supported, but short-lived per-service certs are safer
Suitable for mTLS client auth | Technically yes, but operationally painful at scale | Yes; the correct tool for service-to-service identity
Certificate lifetime | 90 days (Let's Encrypt); 1 year max per CA/Browser Forum | Configurable; recommended 24 hours for internal services
Key compromise response | Submit revocation request, wait for CRL/OCSP propagation | Revoke Vault lease immediately; expires naturally within TTL
Compliance (PCI, SOC2) | Required for public-facing HTTPS; auditors expect public CA | Acceptable for internal traffic; document your CA controls
Observability tooling | SSL Labs, crt.sh, browser DevTools | Vault audit log, cert-manager Prometheus metrics, custom health checks

🎯 Key Takeaways

  • A certificate is a signed claim, not a secret. Its value comes entirely from the trustworthiness of the CA that signed it, which is why a misconfigured trust store is a far bigger security risk than a weak password on the keystore file itself.
  • A missing Intermediate CA certificate in your server config is the single most common PKI incident in production: it passes all your own tests because your tooling does AIA fetching, then silently breaks every mobile client in production.
  • Reach for short-lived certificates (24-hour TTL) with automated rotation the moment you're managing more than a handful of internal service identities. OCSP and CRL are the right answer to the wrong question at that scale.
  • The JVM's cacerts trust store is not static infrastructure. It gets updated with every JRE release, which means upgrading your JRE can silently remove trust for a Root CA your internal services depend on. Pin your internal CA explicitly in a separate truststore and never rely on it being in cacerts.

⚠ Common Mistakes to Avoid

  • ✕ Mistake 1: Configuring the JDK HttpClient or OkHttp with the default system trust store for internal mTLS calls. Symptom: PKIX path building failed: unable to find valid certification path to requested target, even though the internal CA cert is clearly present. Fix: create a custom SSLContext whose TrustManagerFactory is initialized with your internal CA truststore, rather than calling TrustManagerFactory.init((KeyStore) null), which falls back to the JVM default trust store.
  • ✕ Mistake 2: Omitting the Intermediate CA certificate when configuring Nginx or Tomcat. Symptom: works fine in desktop browsers such as Chrome (which fetch or cache missing intermediates via AIA), fails on mobile clients (on Android with javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException: Trust anchor for certification path not found). Fix: concatenate leaf cert + intermediate cert in your ssl_certificate PEM file, in that order; verify chain completeness with openssl s_client -connect host:443 -showcerts.
  • ✕ Mistake 3: Using CN (Common Name) instead of SAN (Subject Alternative Name) for hostname binding in new certificates. Symptom: javax.net.ssl.SSLPeerUnverifiedException: No subject alternative names present in Java 8u181+, and ERR_CERT_COMMON_NAME_INVALID in Chrome 58+. Fix: always include the hostname in the SAN extension (dNSName) when generating CSRs; CN-only hostname matching has been deprecated since RFC 2818 in 2000, but tools still let you do it.
  • ✕ Mistake 4: Storing keystore passwords in environment variables or application.properties. Symptom: no immediate error, but credentials surface in process inspection (/proc/<pid>/environ, ps e), docker inspect, and log aggregation when the app prints its config at startup. Fix: use Vault Agent sidecar injection or Kubernetes External Secrets to mount the password as a file, read it once at startup, then zero out the char[] immediately after KeyStore.load().
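
The fixes for Mistake 1 and Mistake 4 can be combined in one place. The following is a minimal sketch, assuming a PKCS12 truststore and a password mounted as a file; the class name InternalTrust, the method names, and both paths are hypothetical. Zeroing the arrays narrows the exposure window but cannot guarantee the JVM holds no other copy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper: builds an SSLContext that trusts only the internal CA truststore.
final class InternalTrust {

    static SSLContext sslContextFor(Path trustStoreFile, Path passwordFile) {
        char[] password = readPassword(passwordFile);
        try (InputStream in = Files.newInputStream(trustStoreFile)) {
            KeyStore trustStore = KeyStore.getInstance("PKCS12"); // assumed store format
            trustStore.load(in, password);

            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            // Explicit store: passing null here would fall back to the JVM default trust store.
            tmf.init(trustStore);

            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, tmf.getTrustManagers(), null); // no client identity in this sketch
            return context;
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("Could not build SSLContext from " + trustStoreFile, e);
        } finally {
            Arrays.fill(password, '\0'); // zero the password once the store is loaded
        }
    }

    // Reads a UTF-8 password file into a char[] without creating a String,
    // dropping any trailing line break left by the secret mount.
    static char[] readPassword(Path passwordFile) {
        try {
            byte[] raw = Files.readAllBytes(passwordFile);
            CharBuffer decoded = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(raw));
            Arrays.fill(raw, (byte) 0);

            int end = decoded.limit();
            while (end > 0 && (decoded.get(end - 1) == '\n' || decoded.get(end - 1) == '\r')) {
                end--;
            }
            char[] password = new char[end];
            decoded.get(password, 0, end);
            if (decoded.hasArray()) {
                Arrays.fill(decoded.array(), '\0'); // wipe the intermediate buffer too
            }
            return password;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting context can be handed to a client builder, for example java.net.http.HttpClient.newBuilder().sslContext(...), so that internal calls never consult cacerts.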

Interview Questions on This Topic

  • Q: Walk me through what happens, step by step, when a Java service makes an HTTPS call to an external API and the TLS handshake fails with PKIX path building failed. What are the three most likely root causes in production, and how do you diagnose each one without restarting the service?
  • Q: Your new microservices architecture has 80 services doing service-to-service calls. You need to choose between API key authentication and mTLS for service identity. The security team wants mTLS. The platform team says the operational overhead is too high. How do you resolve this, what does your certificate lifecycle automation look like, and what certificate TTL do you choose and why?
  • Q: A developer on your team says: 'I configured OCSP stapling in Nginx for our internal services so we can revoke certificates immediately if a key is compromised.' What's wrong with this approach for an internal microservices mesh, and what would you replace it with?
  • Q: Your internal CA root certificate is valid for 10 years. One of your engineers suggests making it 20 years to reduce operational burden. What are the security and operational arguments for and against a long-lived Root CA certificate, and how does the offline key storage model affect your answer?

Frequently Asked Questions

Why does my HTTPS connection work in the browser but fail in my Java application with PKIX path building failed?

Your browser uses the OS trust store and performs AIA fetching to download missing Intermediate CAs automatically; your JVM does neither by default. The JVM checks only its own cacerts trust store (or whatever SSLContext you configured), and if the server's certificate chain is incomplete or the Root CA isn't in cacerts, it hard-fails. Fix it by either importing the missing CA into your JVM's cacerts with keytool -import, or building a custom SSLContext backed by a truststore that contains the CA. For internal CAs, always use a custom truststore and never import internal CAs into the global cacerts.

What's the difference between a keystore and a truststore in Java TLS?

A keystore holds your own private key and certificate: your identity. A truststore holds certificates of CAs you're willing to trust: your list of acceptable identities. In a standard HTTPS client you only need a truststore. In mTLS you need both: the truststore to validate the server's cert, and the keystore to present your own cert to the server. The JVM uses the same KeyStore class for both; the distinction is purely how you wire it into KeyManagerFactory versus TrustManagerFactory.
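
The wiring difference is easiest to see side by side. This is a minimal sketch, assuming both stores are already loaded; MutualTlsContext and its build method are hypothetical names, not part of any library.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper: same KeyStore type on both sides, different factories.
final class MutualTlsContext {

    static SSLContext build(KeyStore identity, char[] keyPassword, KeyStore trusted) {
        try {
            // Keystore role: the private key and certificate this service presents.
            KeyManagerFactory kmf =
                    KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
            kmf.init(identity, keyPassword);

            // Truststore role: the CA certificates this service accepts from peers.
            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trusted);

            SSLContext context = SSLContext.getInstance("TLS");
            context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not build mTLS SSLContext", e);
        }
    }
}
```

A plain HTTPS client would pass null for the key managers and keep only the trust side.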

How do I rotate a TLS certificate in production without dropping active connections?

Add the new certificate to your load balancer or reverse proxy alongside the old one before the old one expires. Most tools (Nginx, HAProxy, AWS ALB) support multiple certificates per listener and select by SNI. Keep both active simultaneously for at least one full connection timeout window (typically 15-30 minutes). Once traffic has drained from the old cert's sessions, remove it. For mTLS client certs, the server's truststore must contain both old and new CA certs during the rotation window. Automate the whole sequence with cert-manager or Vault's PKI Secrets Engine; manual rotation is how you create 3 AM incidents.

What actually happens when a Root CA certificate expires, and how do you plan for it?

When a Root CA expires, every certificate it signed, and every Intermediate CA it signed, becomes immediately invalid from the perspective of clients that enforce expiry on the trust anchor. This is not theoretical: in May 2020, the AddTrust External CA Root expired and broke thousands of services whose clients built the chain up to the expired root, even though a valid path to a newer cross-signing root existed. The fix isn't just renewing the root: you must update every client's trust store to include the new root before the old one expires. For internal PKI, start planning Root CA rotation at least 12 months before expiry: new root issuance, cross-signing with the old root, rolling out the new root to all truststores, then cutting over. Don't wait for the 30-day alert.
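
Expiry monitoring usually covers leaf certificates only, so roots get missed. The sketch below applies the same idea to trust anchors; TrustAnchorExpiryAudit and anchorsExpiringWithin are hypothetical names, and the 12-month window in the usage note simply mirrors the planning horizon above.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical audit helper: lists trust anchors that expire inside a planning window.
final class TrustAnchorExpiryAudit {

    // Pass null as trustStore to audit the JVM default trust store.
    static List<String> anchorsExpiringWithin(KeyStore trustStore, Duration window, Instant now) {
        try {
            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trustStore);

            Instant cutoff = now.plus(window);
            List<String> expiring = new ArrayList<>();
            for (TrustManager manager : tmf.getTrustManagers()) {
                if (!(manager instanceof X509TrustManager)) {
                    continue;
                }
                for (X509Certificate anchor : ((X509TrustManager) manager).getAcceptedIssuers()) {
                    Instant notAfter = anchor.getNotAfter().toInstant();
                    if (notAfter.isBefore(cutoff)) {
                        // Includes anchors that have already expired.
                        expiring.add(anchor.getSubjectX500Principal().getName()
                                + " expires " + notAfter);
                    }
                }
            }
            return expiring;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not inspect trust anchors", e);
        }
    }
}
```

Calling anchorsExpiringWithin(internalTrustStore, Duration.ofDays(365), Instant.now()) from a scheduled job gives the rotation project its starting signal a year out rather than at the 30-day alert.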

Naren, Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousEncryption Algorithms Explained: AES, RSA, DES and More
Forged with πŸ”₯ at TheCodeForge.io β€” Where Developers Are Forged