Pod Autoscaling
In this section, you will learn how to leverage Kubernetes Horizontal Pod Autoscalers (HPAs) to scale confidential workloads based on the average CPU and memory usage inside the Confidential Pod.
Overview
When launching a Confidential Pod in EKS using the Anjuna Kubernetes Toolset for AWS EKS, two objects are created on the EKS node:
- A Launcher Pod, used to launch the application container in an AWS Nitro Enclave, and to provide the mechanism for interacting with that application container.
- The enclave that runs the actual application container.
If you want to use a Kubernetes HPA to automatically scale Confidential Pods up or down, you cannot use the standard Kubernetes metrics, such as cpu and memory. This is because the HPA controller would query the CPU and memory usage of the Launcher Pod, which only acts as a proxy to the enclave, rather than the CPU and memory usage of the application container running in the enclave.
The Anjuna Kubernetes Toolset for AWS EKS introduces an additional set of custom metrics to monitor the resources of the application container that is running in the enclave.
Enclave metrics
The Anjuna Runtime periodically collects the CPU and memory usage of the AWS Nitro Enclave, and exports them via the Anjuna Launcher.
This allows external entities to query, process, and aggregate those metrics, and to use them for tasks such as autoscaling enclave workloads based on the CPU and memory usage of the enclave.
The following enclave metrics are currently available:
- nitro_enclave_cpu_usage: Average CPU usage across all CPU cores of the enclave, in decimal format (0.0 - 1.0). For example, 0.5 means 50% CPU usage.
- nitro_enclave_memory_usage: Average memory usage of the enclave, in decimal format (0.0 - 1.0). For example, 0.5 means 50% memory usage. Note that the memory usage of the enclave also includes the enclave filesystem, since it is mounted in-memory.
In both cases, the average usage reported by the enclave metrics includes both the application container and the Anjuna Runtime.
Enclave metrics are exported by default in EKS. Metrics are available on port 59090 of the Confidential Pod, and can be queried via the /metrics HTTP endpoint.
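A consumer could read the endpoint's response along the following lines. Note that the exact JSON payload shape used here is an assumption for illustration; only the metric names and the 0.0 - 1.0 decimal format are documented.

```python
import json

# Hypothetical payload returned by the /metrics endpoint. The shape of this
# JSON object is an assumption; only the metric names and decimal format
# (0.0 - 1.0) are documented.
payload = '{"nitro_enclave_cpu_usage": 0.42, "nitro_enclave_memory_usage": 0.61}'

metrics = json.loads(payload)
for name, value in metrics.items():
    # Convert the decimal usage to a percentage for display.
    print(f"{name}: {value * 100:.0f}%")
```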
Exposing enclave metrics on a different port
By default, the enclave metrics are exposed on port 59090 of the Confidential Pod. If you want to expose the enclave metrics on a different port, you can do so by adding the nitro.k8s.anjuna.io/metricsPort annotation to the Pod template spec:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        nitro.k8s.anjuna.io/managed: "yes"
      annotations:
        nitro.k8s.anjuna.io/metricsPort: "9091" # custom metrics port
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
In this example, the Pods created by the nginx Deployment will expose metrics on port 9091 (set by the nitro.k8s.anjuna.io/metricsPort annotation) instead of the default 59090.
Save the file above as deployment.yaml, and create the Deployment by running:
$ kubectl apply -f deployment.yaml
Anjuna Metrics Server
The enclave metrics in EKS are exposed as custom metrics to the Kubernetes API. Since this leverages the Custom Metrics API, a metrics server is required to implement the API and expose the metrics.
The Anjuna Metrics Server is a component of the Anjuna Kubernetes Toolset for AWS EKS that implements the Custom Metrics API.
Its main responsibility is to fetch and aggregate the metrics from the enclave, and expose them to the Kubernetes API, so that components like the Horizontal Pod Autoscaler (HPA) can use them to scale workloads based on the average usage of CPU and memory inside the enclave.
The Anjuna Metrics Server is deployed by default as part of the Anjuna Kubernetes Toolset for AWS EKS installation.
If your cluster already implements a custom metrics server, it can be reused to expose the enclave metrics. Learn more in Integrating with existing metrics pipelines.
Using HPAs
Given that the enclave metrics are collected by a metrics server and exposed via the Custom Metrics API, HPAs can be configured to scale workloads based on the average usage of CPU and memory inside the enclave.
In the example below, the HPA is configured to scale the Deployment nginx based on the average CPU and memory usage inside the enclave: it scales up whenever the average CPU usage exceeds 75% or the average memory usage exceeds 50%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: nitro_enclave_cpu_usage
      target:
        type: AverageValue
        averageValue: 0.75
  - type: Pods
    pods:
      metric:
        name: nitro_enclave_memory_usage
      target:
        type: AverageValue
        averageValue: 0.5
Since nitro_enclave_cpu_usage and nitro_enclave_memory_usage are custom metrics, the resulting HPA spec differs from one that acts on standard metrics, such as cpu and memory: instead of specifying metrics of type Resource, the HPA spec specifies metrics of type Pods.
Save the file above as hpa.yaml, and create the HPA by running:
$ kubectl apply -f hpa.yaml
Verify that the HPA was created successfully by running:
$ kubectl describe hpa nginx
The output should resemble the following:
Name: nginx
Namespace: default
Labels: <none>
Annotations: <none>
Reference: Deployment/nginx
Metrics: ( current / target )
"nitro_enclave_cpu_usage" on pods: 0 / 750m
"nitro_enclave_memory_usage" on pods: 0 / 500m
Min replicas: 1
Max replicas: 5
...
Deployment pods: 1 current / 1 desired
Due to the way that Kubernetes serializes quantities, the averageValue targets will be displayed as 750m (instead of 0.75) and 500m (instead of 0.5) in the output above.
These values are equivalent and do not affect the behavior of the HPA.
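To illustrate the arithmetic, here is a minimal sketch that parses the milli-suffixed quantities shown by kubectl describe hpa back into decimals and applies the standard HPA scaling formula from the Kubernetes documentation, desired = ceil(currentReplicas * observed / target). The helper names are ours, for illustration only.

```python
import math

def parse_milli(quantity: str) -> float:
    """Parse a Kubernetes 'm' (milli) suffixed quantity, e.g. '750m' -> 0.75."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def desired_replicas(current: int, observed: float, target: float) -> int:
    """Standard HPA formula: ceil(currentReplicas * observed / target)."""
    return math.ceil(current * (observed / target))

target = parse_milli("750m")             # the nitro_enclave_cpu_usage target, 0.75
print(desired_replicas(2, 0.9, target))  # 2 replicas at 0.9 usage -> scales to 3
```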
Mutating HPAs automatically
For convenience, the Anjuna Nitro Webhook can automatically mutate HPAs to transform standard metrics into enclave metrics, so that pre-existing HPA specs can be used without modification.
Similar to the way the Anjuna Webhook mutates Pods to run them inside of AWS Nitro Enclaves, you can add the label nitro.k8s.anjuna.io/managed: "yes" to the HPA spec to have it automatically translate cpu and memory metrics into nitro_enclave_cpu_usage and nitro_enclave_memory_usage metrics, respectively.
In the example below, the HPA is configured to scale the Deployment nginx based on the standard metrics cpu and memory.
The addition of the label nitro.k8s.anjuna.io/managed: "yes" will cause the Anjuna Webhook to mutate the HPA spec to use the enclave metrics instead.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
  labels:
    nitro.k8s.anjuna.io/managed: "yes"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 50
The following mutations will take place:
- The metric cpu of type Resource with a target averageUtilization of 75% is translated to the metric nitro_enclave_cpu_usage of type Pods with a target averageValue of 0.75.
- The metric memory of type Resource with a target averageUtilization of 50% is translated to the metric nitro_enclave_memory_usage of type Pods with a target averageValue of 0.5.
The original metrics in the HPA must be either cpu or memory metrics with an averageUtilization target type.
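The translation can be sketched as follows. This is an illustration of the mapping described above, not Anjuna's actual webhook implementation:

```python
# Illustrative sketch of the webhook's metric translation: rewrite a
# Resource cpu/memory metric entry into the equivalent Pods enclave metric.
# (Not Anjuna's actual implementation.)

ENCLAVE_METRIC = {
    "cpu": "nitro_enclave_cpu_usage",
    "memory": "nitro_enclave_memory_usage",
}

def mutate_metric(metric: dict) -> dict:
    """Translate one HPA metric entry from Resource form to Pods form."""
    resource = metric["resource"]
    utilization = resource["target"]["averageUtilization"]  # e.g. 75 (%)
    return {
        "type": "Pods",
        "pods": {
            "metric": {"name": ENCLAVE_METRIC[resource["name"]]},
            "target": {
                "type": "AverageValue",
                # 75% becomes the decimal averageValue 0.75
                "averageValue": utilization / 100,
            },
        },
    }

original = {
    "type": "Resource",
    "resource": {"name": "cpu", "target": {"type": "Utilization", "averageUtilization": 75}},
}
print(mutate_metric(original))
```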
The final mutated HPA spec will look like this:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
  labels:
    nitro.k8s.anjuna.io/managed: "yes"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: nitro_enclave_cpu_usage
      target:
        type: AverageValue
        averageValue: 0.75
  - type: Pods
    pods:
      metric:
        name: nitro_enclave_memory_usage
      target:
        type: AverageValue
        averageValue: 0.5
Configuring HPAs for the enclave and for the Launcher
The Anjuna Launcher is responsible for managing the lifecycle of the AWS Nitro Enclave.
Since the Launcher itself does not run inside of an enclave, the standard cpu and memory metrics can be used to scale the Launcher.
Therefore, it is possible to configure an HPA with separate targets for the enclave and for the Launcher, by combining the cpu, memory, nitro_enclave_cpu_usage, and nitro_enclave_memory_usage metrics, as shown in the example below:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-launcher-and-enclave
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 90
  - type: Pods
    pods:
      metric:
        name: nitro_enclave_cpu_usage
      target:
        type: AverageValue
        averageValue: 0.75
The HPA spec above will scale up the Deployment nginx whenever the Launcher CPU usage exceeds 90% (the Resource metric) or the enclave CPU usage exceeds 75% (the Pods metric).
Integrating with existing metrics pipelines
The enclave metrics can be integrated into existing metrics pipelines for clusters that already have a custom metrics server that implements the Custom Metrics API.
In such cases, the Anjuna Kubernetes Toolset for AWS EKS can be installed without the Anjuna Metrics Server, by setting the Helm chart parameter metricsServer.enabled to false:
$ helm install anjuna-tools helm-charts/anjuna-tools \
--set metricsServer.enabled=false
Make sure that the agents and scrapers are configured to scrape the enclave metrics from the Confidential Pod. An additional component might be required to convert the enclave metrics, which are exported as a JSON object, into the format expected by the metrics pipeline in place (e.g., via a sidecar).
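Such a sidecar could, for instance, translate the JSON metrics into the Prometheus text exposition format. A minimal sketch, assuming a flat JSON object keyed by metric name (the payload shape is an assumption, not documented):

```python
import json

def to_prometheus(payload: str) -> str:
    """Convert a JSON object of metric-name -> value pairs into the
    Prometheus text exposition format (one 'name value' line per metric)."""
    metrics = json.loads(payload)
    return "\n".join(f"{name} {value}" for name, value in metrics.items())

# Hypothetical enclave metrics payload, converted for a Prometheus scraper.
payload = '{"nitro_enclave_cpu_usage": 0.42, "nitro_enclave_memory_usage": 0.61}'
print(to_prometheus(payload))
```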
You can verify if your cluster already provides a custom metrics server by running:
$ kubectl get apiservices | grep -e ".*.custom.metrics.k8s.io"
An empty output indicates that no custom metrics server is available.