Tuesday, February 28, 2023

Routing TCP over Envoy using HTTP CONNECT

Tunneling TCP traffic through Envoy's L4 proxy capabilities works well, but TCP itself carries very little metadata that could be used for routing. Using the HTTP CONNECT method, however, a caller can instruct a proxy to tunnel the subsequent data as raw TCP to some target, without the proxy interpreting it as HTTP or any other L7 protocol. It works as follows:

  1. A caller A wants to send some TCP traffic to a service B.
  2. The caller A calls some proxy P by making an HTTP request of the form: CONNECT <address_of_B>:<port> HTTP/1.1 (the request target is in authority form, i.e. host:port, with no scheme or path).
  3. The proxy P opens a TCP connection to that address and port, responds to A with a 2xx status (e.g. 200 Connection established), and keeps the connection from A open.
  4. The caller A then uses its connection to P to stream the TCP payload it needs to send to B. P relays this traffic to B.
In the above, P is said to terminate the HTTP CONNECT. Equally well, it could be configured to propagate the HTTP CONNECT instead of terminating it, proxying everything it receives to the next proxy in the chain (its "upstream", in Envoy terminology). The final proxy in the chain would then terminate the HTTP CONNECT and forward the traffic to the target.
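
To make the handshake concrete, here is a minimal client-side sketch in Python. The proxy address and the fubar.xyz:1234 authority are assumptions for illustration (they happen to match the Envoy sample further down); a real client would parse the proxy's response properly rather than trusting a single recv().

import socket

PROXY = ("127.0.0.1", 9000)   # hypothetical address of the proxy P
AUTHORITY = "fubar.xyz:1234"  # target B as the proxies know it; B itself
                              # need not even be resolvable from here

# Step 2: ask P to open a tunnel to B (request target in authority form).
sock = socket.create_connection(PROXY)
sock.sendall(f"CONNECT {AUTHORITY} HTTP/1.1\r\nHost: {AUTHORITY}\r\n\r\n".encode())

# Step 3: P replies with a 2xx once its own upstream connection is up.
reply = sock.recv(4096)
assert reply.split(b" ", 2)[1] == b"200", reply

# Step 4: from here on the socket is a raw TCP pipe to B.
sock.sendall(b"arbitrary TCP payload for B")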

The elegance of this approach is that by encapsulating the connection in an HTTP shim, we open up HTTP headers as a mechanism for carrying routing directives that the intermediate proxies can act on. If the caller uses TLS, it can set the server name in the TLS handshake, and the proxies can use SNI to route the request. The actual target address of B need not even be routable from A; it only needs to be routable from the final proxy in the chain (the one that terminates the HTTP CONNECT). With HTTP/2's extended CONNECT (RFC 8441), the request can even carry a URL path. I'm not sure, but perhaps this path too could be used for routing purposes just as with regular requests. And with HTTP/2, multiple TCP tunnels can be multiplexed over a single connection, improving resource usage and latency when connections are reused from a pool.
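
As a sketch of the SNI idea, here is how a caller would supply the server name. Everything below is hypothetical: a TLS-terminating proxy on 127.0.0.1:8443 that routes on SNI, with certificate verification disabled purely for illustration.

import socket
import ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False        # for illustration only
ctx.verify_mode = ssl.CERT_NONE   # for illustration only
raw = socket.create_connection(("127.0.0.1", 8443))
# server_hostname sets the SNI field of the ClientHello; the proxy can
# route on it without decrypting the rest of the stream.
tls = ctx.wrap_socket(raw, server_hostname="fubar.xyz")
tls.sendall(b"CONNECT fubar.xyz:1234 HTTP/1.1\r\nHost: fubar.xyz:1234\r\n\r\n")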

The obvious downside is that the caller A needs to know the mechanics of HTTP CONNECT and take a dependency on it. But this is a small price to pay for not having to solve routing at the TCP layer.

Envoy has supported HTTP CONNECT for a few years now (possibly since 1.14.x). Here is a small sample configuration that simulates a chain of two proxies with two listeners in a single Envoy: one propagating the HTTP CONNECT and the other terminating it, to route TCP traffic to a destination (here, a local sshd).

static_resources:
  listeners:
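  # listener_0 (port 10000): terminates the HTTP CONNECT and streams the
  # tunneled TCP to the ssh cluster.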
  - name: listener_0
    address:
      socket_address: { address: 127.0.0.1, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: connect_tcp
              domains: ["fubar.xyz:1234"]
              routes:
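              # connect_config on the route terminates the CONNECT here and
              # forwards the raw TCP payload to the ssh cluster.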
              - match: { headers: [{name: ":authority", suffix_match: ":1234"}], connect_matcher: {} }
                route: { cluster: ssh, upgrade_configs: [{upgrade_type: "CONNECT", connect_config: {}}] }
              - match: { prefix: "/" }
                route: { cluster: ssh }
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: ssh }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          upgrade_configs:
          - upgrade_type: CONNECT
          http2_protocol_options:
            allow_connect: true
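  # listener_1 (port 9000): the front proxy; it propagates the HTTP CONNECT
  # to its upstream (listener_0) instead of terminating it.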
  - name: listener_1
    address:
      socket_address: { address: 127.0.0.1, port_value: 9000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route1
            virtual_hosts:
            - name: connect_fwd
              domains: ["fubar.xyz", "fubar.xyz:*"]
              routes:
              - match: { connect_matcher: {} }
                # no connect_config here, so the CONNECT is not terminated but
                # proxied on to the conti cluster; timeout 0s disables the
                # route timeout for the long-lived tunnel
                route: { cluster: conti, timeout: "0s"}
              - match: { prefix: "/" }
                route: { cluster: conti1 }
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: conti1 }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          upgrade_configs:
          - upgrade_type: CONNECT
          http2_protocol_options:
            allow_connect: true
  clusters:
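  # ssh: the actual target B (a local sshd); it only needs to be reachable
  # from the terminating proxy.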
  - name: ssh
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: ssh
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 22
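  # conti: the next proxy in the chain, i.e. listener_0, which terminates
  # the CONNECT.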
  - name: conti
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: conti
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 10000
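  # conti1: fallback cluster for ordinary (non-CONNECT) HTTP requests.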
  - name: conti1
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: conti1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8000
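
With this configuration running (envoy -c connect.yaml, say) and a local sshd on port 22, the chain can be exercised end to end. A quick sketch, reusing the client pattern from above:

import socket

# Send a CONNECT to the front proxy (listener_1 on 9000); it forwards the
# CONNECT to listener_0, which terminates it and opens a TCP connection to
# the sshd on port 22.
sock = socket.create_connection(("127.0.0.1", 9000))
sock.sendall(b"CONNECT fubar.xyz:1234 HTTP/1.1\r\nHost: fubar.xyz:1234\r\n\r\n")
print(sock.recv(4096))  # expect an HTTP 200 response
print(sock.recv(4096))  # expect the sshd banner, e.g. b'SSH-2.0-OpenSSH_...'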
(More explanation to follow.)
