At Semantics3, we are heavily invested in micro-services. We have a suite of lightweight HTTP services deployed that support not just our user-facing products but also different parts of our data pipeline. Suffice it to say that HTTP is the lingua franca of most of our internal tools and services.

However, when the time first came to expose some of these services over the Internet, we hit a snag. We realized that each of them needed to be capable of tackling a bunch of problems:

  • Force TLS transport
  • Authentication
  • Access Control
  • Rate-limiting

Note that these challenges were specific to the Internet-facing services only. For the services consumed exclusively from within our private cloud on AWS, we were happy to use plain HTTP for transport, skip authentication and throttling, and let EC2 Security Groups handle access control.

Application Middleware

Web frameworks try to solve these problems using what they call middleware.

An application middleware is a function (or a class, or some other language construct) that gets mutative access to the HTTP request and response during the request lifecycle. A single request goes through a middleware chain in sequence. Middleware is, therefore, a powerful, composable building block of an HTTP service. That is why this concept has spawned an entire ecosystem of plugins for each framework that supports it. These plugins handle a variety of use-cases, including the ones that I listed earlier.
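As a rough sketch (plain callables rather than any particular framework's API), middleware composition looks something like this:

```python
import time

def logging_middleware(handler):
    """Logs every request before and after the wrapped handler runs."""
    def wrapped(request):
        print(f"-> {request['method']} {request['path']}")
        response = handler(request)
        print(f"<- {response['status']}")
        return response
    return wrapped

def timing_middleware(handler):
    """Attaches the handler's elapsed time to the response."""
    def wrapped(request):
        start = time.monotonic()
        response = handler(request)
        response["elapsed"] = time.monotonic() - start
        return response
    return wrapped

def handler(request):
    # The actual application logic sits at the end of the chain.
    return {"status": 200, "body": "hello"}

# Each request flows through the chain in sequence:
# logging -> timing -> handler, and back out in reverse.
app = logging_middleware(timing_middleware(handler))
response = app({"method": "GET", "path": "/products"})
```

Each middleware sees the request on the way in and the response on the way out, which is what makes the pattern composable.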


Not every middleware is the same. Some are application-specific (logging, telemetry, etc.) while others depend more on the consumer. When we took a closer look at the problems we wanted to solve, we realized that they were essentially consumer-specific and had little to do with the application itself. Using application middleware would have required us to handle each request conditionally depending on the client. For instance, authentication should be bypassed if the request comes from another internal service, but enforced if it arrives over the Internet. We felt our services could do without this complexity. What we needed was for these problems to be solved somewhere else that had the same access to the request lifecycle that application middleware enjoys: at a different layer!
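Handled inside the application, that conditionality might look like the sketch below (the request shape, the API key, and the private-address check are all made up for illustration):

```python
INTERNAL_PREFIXES = ("10.", "192.168.")  # hypothetical private-cloud ranges

def auth_middleware(handler):
    """Enforces auth for external clients, bypasses it for internal ones."""
    def wrapped(request):
        internal = request["remote_addr"].startswith(INTERNAL_PREFIXES)
        if not internal and request.get("api_key") != "secret-key":
            return {"status": 401, "body": "unauthorized"}
        return handler(request)
    return wrapped

@auth_middleware
def handler(request):
    return {"status": 200, "body": "ok"}
```

Every consumer-specific rule adds another branch like this to every service, which is exactly the complexity we wanted to keep out of the application.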


In this model, a bunch of different layers isolate the micro-service from the Internet. Each layer is a reverse proxy HTTP service that acts on an incoming request or an outgoing response and is even capable of terminating a request before it reaches the backend service. Once the work in a layer is done, the request is proxied to the next layer. Hence, a middleware layer is conceptually similar to application middleware.
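To make the idea concrete, here is a toy layer in Python (the ports and addresses are made up): a reverse proxy that does its own work and then forwards the request to the next layer.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

NEXT_LAYER = "http://127.0.0.1:18081"  # hypothetical address of the next layer

class ProxyLayer(BaseHTTPRequestHandler):
    """A middleware layer: act on the request, then proxy it onward."""
    def do_GET(self):
        # A real layer would authenticate, rate-limit, etc. here, and could
        # terminate the request before it ever reaches the backend.
        with urllib.request.urlopen(NEXT_LAYER + self.path, timeout=5) as resp:
            body = resp.read()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

class Backend(BaseHTTPRequestHandler):
    """The micro-service at the end of the chain."""
    def do_GET(self):
        body = b"hello from the backend"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(handler, port):
    server = HTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

serve(Backend, 18081)
serve(ProxyLayer, 18080)

with urllib.request.urlopen("http://127.0.0.1:18080/", timeout=5) as resp:
    answer = resp.read()
```

The client only ever talks to the outermost layer; each layer knows just enough to do its job and hand the request onward.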

Currently, we use two middleware layers:

  • TLS terminator
  • Gateway

TLS terminator

The first layer, which is the only component exposed to the Internet, is responsible for decrypting incoming and encrypting outgoing HTTP traffic. It is configured with our SSL certificate and private key.

There are a number of tools you can use for this, such as NGINX, HAProxy, or stunnel.
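With NGINX, for example, TLS termination reduces to a few lines of configuration (the hostnames, paths, and upstream address below are placeholders):

```nginx
server {
    listen 443 ssl;
    server_name api.example.com;

    ssl_certificate     /etc/nginx/certs/example.crt;
    ssl_certificate_key /etc/nginx/certs/example.key;

    location / {
        # Traffic continues to the next layer over plain HTTP.
        proxy_pass http://gateway.internal:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```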


Gateway

The Gateway receives requests over plain HTTP from the layer above and performs a variety of functions:

  • Routing: It is responsible for finding out which backend micro-service to proxy to. This is based either on the HTTP Host header or, more commonly, the request URI.
  • Identity: For stateful clients, it issues tokens when valid credentials are presented and validates tokens sent with requests. For stateless clients, it validates the credentials sent along with each request to a protected backend service endpoint.
  • Access Control: Once their identity is resolved, it checks if the client is allowed to access the service endpoint in question.
  • Rate-Limiting: It ensures that an authorized client does not abuse the service by capping the number of requests a client can make within a small time window. In the case of our API product, this is actually a first-class feature.
  • Load Balancing: It allows the backend service to run as multiple instances and still be served off a single Internet URI. In this case, the Gateway is responsible for picking the right backend service instance to proxy each request to.
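Rate-limiting, for instance, can be as small as a fixed-window counter per client (the window size and cap below are arbitrary):

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60   # assumed window size
MAX_REQUESTS = 100    # assumed per-client cap

# client_id -> (request count, start of the current window)
_counters = defaultdict(lambda: (0, 0.0))

def allow(client_id, now=None):
    """Fixed-window rate limit: True if the client may proceed."""
    now = time.monotonic() if now is None else now
    count, start = _counters[client_id]
    if now - start >= WINDOW_SECONDS:
        count, start = 0, now  # the window has elapsed; start a fresh one
    if count >= MAX_REQUESTS:
        return False
    _counters[client_id] = (count + 1, start)
    return True
```

A rejected request would be answered with 429 by the gateway and never reach the backend service.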

The gateway handles each of these use-cases based on the rules that it is configured with. While it is true that each of these functions could be extracted into a layer of its own, we haven’t yet found a need for that.
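As an illustration, the routing rules can be as simple as a prefix table (the service names and ports below are made up):

```python
# Hypothetical routing table: URI prefix -> backend service address.
ROUTES = {
    "/v1/products": "http://products.internal:8000",
    "/v1/offers": "http://offers.internal:8001",
}

def resolve_backend(path):
    """Pick the backend for a request path; the longest prefix wins."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    return None  # no matching rule: the gateway answers 404 itself
```

Identity, access control, and rate-limiting rules can be looked up the same way, keyed by client rather than by path.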

You may want to make sure that the benefit of adding a new layer outweighs the latency cost of adding an additional hop between the client and your service.

You can build a gateway for your own application using any programming language that has a robust HTTP proxy library.

Why Layers?

  • The system resembles a Unix pipeline with each layer assuming a single responsibility — which for the micro-service (the last layer) is just managing its own business logic.
  • Different members of the team can own different layers. For example, backend service authors on our team are almost fully abstracted from the infrastructure (layers) fronting their services.
  • Different layers can be written in different programming languages letting you pick the right tool for the job. If you run your middleware layers from within isolated Linux containers like we do, you also get the added benefit of swapping out/upgrading layers easily.
  • You can scale your middleware independent of your application.
  • A layered system is easier to reason about, which makes it easier to debug and hence easier to maintain.


This system of having middleware as layers has served us well so far. We have been using it in production over the last few years to serve different kinds of service consumers over the Internet: