bug: gateway logs InvalidContentType TLS errors every 5s from kubelet probes #897

@sjenning

Agent Diagnostic

Investigated the gateway pod logs and Helm statefulset template. The liveness and readiness probes use tcpSocket on the gateway's gRPC port. When TLS is enabled, kubelet opens a raw TCP connection and sends plaintext bytes, which the rustls TLS acceptor rejects with InvalidContentType. This produces ERROR-level log lines every 5 seconds (matching the liveness probe periodSeconds: 5).

Files examined:

  • deploy/helm/openshell/templates/statefulset.yaml — probe definitions (lines 122-135)
  • deploy/helm/openshell/values.yaml — probe intervals (periodSeconds: 5 for liveness)
  • crates/openshell-server/src/lib.rs — is_benign_tls_handshake_failure only matches UnexpectedEof and ConnectionReset, not InvalidData
  • crates/openshell-server/src/tls.rs — TLS acceptor enforces mTLS when allow_unauthenticated is false

Root cause: tcpSocket probes are protocol-unaware — they just check port reachability. The plaintext TCP bytes fail the TLS handshake. Kubelet's httpGet probes with scheme: HTTPS would perform a proper TLS handshake, but kubelet cannot present a client certificate, so httpGet only works when gateway auth is disabled (disableGatewayAuth: true) or TLS is off entirely.
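To illustrate why non-TLS bytes end up as InvalidContentType, here is a minimal sketch of the check a TLS record layer applies to the first byte of a record. It is not rustls code; the names ContentType, classify_first_byte, and looks_like_tls_handshake are hypothetical, and only the four core content types from RFC 8446 are modeled.

```rust
/// Core TLS record content types (RFC 8446, section 5.1).
/// Illustrative only: this enum is not taken from rustls.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ContentType {
    ChangeCipherSpec, // 20
    Alert,            // 21
    Handshake,        // 22
    ApplicationData,  // 23
}

/// Classify the first byte of a would-be TLS record.
/// Returns None for any byte outside the modeled content types, which is
/// the situation a TLS stack reports as an invalid content type.
pub fn classify_first_byte(byte: u8) -> Option<ContentType> {
    match byte {
        20 => Some(ContentType::ChangeCipherSpec),
        21 => Some(ContentType::Alert),
        22 => Some(ContentType::Handshake),
        23 => Some(ContentType::ApplicationData),
        _ => None,
    }
}

/// True when the peer's first bytes could begin a TLS handshake record.
/// An empty slice (peer sent nothing) is treated as "not a handshake".
pub fn looks_like_tls_handshake(first_bytes: &[u8]) -> bool {
    matches!(
        first_bytes.first().copied().and_then(classify_first_byte),
        Some(ContentType::Handshake)
    )
}
```

A ClientHello starts with byte 0x16 (22), so it passes this check. Any plaintext payload, for example one starting with ASCII 'G' (0x47), does not map to a content type and would be rejected before the handshake begins.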

Description

Actual behavior: The gateway logs ERROR every ~5 seconds:

ERROR openshell_server: TLS handshake failed error=received corrupt message of type InvalidContentType client=10.42.0.1:18166

Expected behavior: Kubernetes health probes should not generate ERROR-level log noise.

Reproduction Steps

  1. Deploy a gateway with default settings (TLS enabled, mTLS enforced)
  2. Wait for the pod to become ready
  3. Observe gateway logs: kubectl logs -n openshell openshell-0
  4. InvalidContentType errors appear every ~5 seconds

Environment

  • Kubernetes: k3s (embedded in openshell cluster container)
  • Probe config: tcpSocket on port 8080, periodSeconds: 5 (liveness)
  • TLS: enabled with mTLS (default)

Proposed Fix

Two-part fix:

  1. Helm: Use httpGet probes (/healthz for liveness, /readyz for readiness) when client certs are not required (disableTls or disableGatewayAuth). Fall back to tcpSocket when mTLS is enforced since kubelet cannot present a client certificate.
  2. Server: Add InvalidData to is_benign_tls_handshake_failure in lib.rs so the remaining tcpSocket probe errors in the mTLS case are logged at DEBUG instead of ERROR.
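A rough sketch of what part 2 could look like. The real signature of is_benign_tls_handshake_failure in lib.rs is not shown in this issue, so this assumes it receives a std::io::Error and that the rustls failure surfaces with ErrorKind::InvalidData. LogLevel and handshake_failure_level are hypothetical helpers added only to make the effect visible.

```rust
use std::io;

/// Sketch of the proposed classification (assumed signature, not the
/// actual code in lib.rs). UnexpectedEof and ConnectionReset are the
/// kinds matched today according to this issue; InvalidData is the
/// proposed addition.
pub fn is_benign_tls_handshake_failure(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        io::ErrorKind::UnexpectedEof
            | io::ErrorKind::ConnectionReset
            | io::ErrorKind::InvalidData
    )
}

/// Hypothetical stand-in for the log level chosen by the accept loop.
#[derive(Debug, PartialEq, Eq)]
pub enum LogLevel {
    Debug,
    Error,
}

/// Benign handshake failures are demoted to DEBUG; everything else
/// stays at ERROR.
pub fn handshake_failure_level(err: &io::Error) -> LogLevel {
    if is_benign_tls_handshake_failure(err) {
        LogLevel::Debug
    } else {
        LogLevel::Error
    }
}
```

One trade-off to keep in mind: matching on InvalidData as a whole would also demote malformed handshakes from real clients to DEBUG, not only probe traffic.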
